Description

Background

Recent reports have documented the importance of responses to the
Group I and Group II allergens in house dust mite allergy. For example, it has
been documented that over 60% of patients have at least 50% of their anti-mite
antibodies directed towards these proteins (Lind, P. et al., Allergy 22:259-274
(1984); van der Zee, J.S. et al., J. Allergy Clin. Immunol. 884-896 (1988)). It
is possible that children show a greater degree of reactivity (Thompson, P.J.
et al., Immunology 311-314 (1988)). Allergy to mites of the genus
Dermatophagoides (D.) is associated with conditions such as asthma, rhinitis and
atopic dermatitis. Two species, D. pteronyssinus and D. farinae, predominate
and, as a result, considerable effort has been expended in trying to identify the
allergens produced by these two species. D. pteronyssinus mites are the most
common Dermatophagoides species in house dust in Western Europe and
Australia. The species D. farinae predominates in other countries, such as North
America and Japan (Wharton, G.W., J. Medical Entom. 12:577-621 (1976)). It
has long been recognized that allergy to mites of this genus is associated with
diseases such as asthma, rhinitis and atopic dermatitis. It is still not clear which
allergens produced by these mites are responsible for the allergic response and
associated conditions.

Summary of the Invention

The present invention relates to isolated DNA which encodes a
protein allergen of Dermatophagoides (D.) (house dust mite) or a peptide which
includes at least one epitope of a protein allergen of a house dust mite of the genus
Dermatophagoides. It particularly relates to DNA encoding major allergens of the
species D. farinae, designated Der f I and Der f II, or portions of these major
allergens (i.e., peptides which include at least one epitope of Der f I or of Der f II).
It also particularly relates to DNA encoding major allergens of D. pteronyssinus,
designated Der p I and Der p II, or portions of these major allergens (i.e., peptides
which include at least one epitope of Der p I or of Der p II).

The present invention further relates to proteins and peptides encoded by
the isolated Dermatophagoides (e.g., D. farinae, D. pteronyssinus) DNA, including
proteins containing sequence polymorphisms. Several nucleotide and resulting amino
acid sequence polymorphisms have been discovered in the Der p I, Der p II and Der f
II allergens. All such nucleotide variations and proteins, or portions thereof,
containing a sequence polymorphism are within the scope of the invention.

Peptides of the present invention include at least one epitope of a
D. farinae allergen (e.g., at least one epitope of Der f I or Der f II) or at least one
epitope of a D. pteronyssinus allergen (e.g., at least one epitope of Der p I or of
Der p II). The invention also relates to antibodies specific for D. farinae proteins
or peptides and to antibodies specific for D. pteronyssinus proteins or peptides.

Dermatophagoides DNA, proteins and peptides of the present invention
are useful for diagnostic and therapeutic purposes. For example, isolated D. farinae
proteins or peptides can be used to detect sensitivity in an individual to house dust
mites and can be used to treat sensitivity (reduce sensitivity or desensitize) in an
individual, to whom therapeutically effective quantities of the D. farinae protein or
peptide are administered. For example, isolated D. farinae protein allergen, such as
Der f I or Der f II, can be administered periodically, using standard techniques, to an
individual in order to desensitize the individual. Alternatively, a peptide which
includes at least one epitope of Der f I or of Der f II can be administered for this
purpose. Isolated D. pteronyssinus protein allergen, such as Der p I or Der p II, can be
administered as described for Der f I or Der f II. Similarly, a peptide which includes
at least one Der p I epitope or at least one Der p II epitope can be administered for this
purpose. A combination of these proteins or peptides (e.g., Der f I and Der f II; Der p
I and Der p II; or a mixture of both Der f and Der p proteins) can also be administered.
The use of such isolated proteins or peptides provides a means of desensitizing
individuals to important house dust mite allergens.






Brief Description of the Drawings

Figures 1A and 1B show the nucleotide and predicted amino acid
sequence of cDNA λgt11 p1(13T) (SEQ ID NOS: 1 and 2, respectively). Numbers to
the right are nucleotide positions whereas numbers above the sequence are amino acid
positions. Positive amino acid residue numbers correspond to the sequence of the
mature excreted Der p I beginning with threonine. Negative sequence numbers refer
to the proposed transient pre- and preproenzyme forms of Der p I. The arrows
indicate the beginning of the proposed proenzyme sequence and the mature Der p I,
respectively. Residues -15 to -13 enclosed by an open box make
up the proposed cleavage site for the proenzyme formation, and the dashed residues
52-54 represent a potential N-glycosylation site. The termination TAA codon and the
adjacent polyadenylation signal are underlined. Amino acid residues 1-41, 79-95,
111-142, and 162-179 correspond to known tryptic peptide sequences determined by
conventional amino acid sequencing analysis.

Figure 2 shows the restriction map of the cDNA insert of clone λgt11
p1(13T) and the strategy of DNA sequencing. Arrows indicate directions in
which sequences were read.

Figure 3 is a comparison of N-terminal sequences of Der p I and
Der f I. The amino acid sequence for Der p I is equivalent to amino acids 1-20 in
Figures 1A and 1B; the Der f I sequence is from reference (12).

Figure 4 shows the reactivity of λgt11 p1(13T) with anti-Der p I.
Lysates from Y1089 lysogens induced for phage were reacted by dot-blot with
rabbit anti-Der p I (Der p I) or normal rabbit serum (Nrs). Dots (2 µl) were made
in triplicate from lysates of bacteria infected with λgt11 p1(13T) (a) or λgt11 (b).
When developed with 125I-protein A and autoradiography, only the reaction
between λgt11 p1(13T) lysate and the anti-Der p I showed reactivity.

Figure 5 shows reaction of clone pGEX-p1(13T) with IgE in allergic
serum. Overnight cultures of pGEX or pGEX-p1 were diluted 1/10 in broth and
grown for 2 hours at 37°C. They were induced with IPTG and grown for 2 hours at
37°C. The bacteria were pelleted and resuspended in PBS to 1/10 the volume of
culture media. The bacteria were lysed by freeze/thaw and sonication. A
radioimmune dot-blot was performed with 2 µl of these lysates using mite-allergic
or non-allergic serum. The dots in row 1 were from E. coli containing pGEX and
rows 2-4 from different cultures of E. coli infected with pGEX-p1(13T).
Reactivity to pGEX-p1(13T) was found with IgE in allergic but not non-allergic
serum. No reactivity to the vector control or with non-allergic serum was found.

Figure 6 shows seroreactivity of cDNA clones coding for Der p II in
plaque radioimmune assay. Segments of nitrocellulose filters from plaque lifts
were taken from clones 1, 3, A, B and the vector control Ampl. These were
reacted by immunoassay for human IgE against allergic serum (AM) in row 1,
non-allergic serum (WT) in row 2 and by protein A immunoassay for Der p I with
rabbit antiserum in row 3. The clones 1, 3 and B reacted strongly with allergic
serum but not non-allergic or vector control. (Clone B and vector control were
not tested with non-allergic serum).

Figures 7A and 7B show the nucleotide and predicted amino acid
sequence of cDNA of λgt11 pII (C1) (SEQ ID NOS: 3 and 4, respectively). Numbers
to the right are nucleotide positions and numbers above are amino acid positions.
Positive numbers for amino acids begin at the known N-terminal of Der p II and
match the known sequence of the first 40 residues. Residues -1 to -16 resemble a
typical leader sequence with a hydrophobic core.

Figure 8 shows the N-terminal amino acid homology of Der p II and Der f
II. (Der f II sequence from reference 30).

Figure 9 is a restriction map of the cDNA insert of clone λgt11 f1,
including a schematic representation of the strategy of DNA sequencing. Arrows
indicate directions in which sequences were read.

Figures 10A and 10B are the nucleotide sequence and the predicted
amino acid sequence of cDNA λgt11 f1 (SEQ ID NOS: 5 and 6, respectively).
Numbers above are nucleotide positions; numbers to the left are amino acid positions.
Positive amino acid residue numbers correspond to the sequence of the mature
excreted Der f I beginning with threonine. Negative sequence numbers refer to the
signal peptide and the proenzyme regions of Der f I. The arrows indicate the
beginning of the proenzyme sequence and the mature Der f I, respectively. The
underlined residues -81 to -78 make up the proposed cleavage site for the proenzyme
formation, while the underlined residues 53-55 represent a potential N-glycosylation
site. The termination TGA codon and the adjacent polyadenylation signal are also
underlined. Amino acid residues 1-28 correspond to a known tryptic peptide sequence
determined by conventional amino acid sequencing analysis.

Figure 11 is a composite alignment of the amino acid sequences of the
mature Der p I (SEQ ID NO: 11) and Der f I proteins. The numbering above the
sequence refers to Der p I. The asterisk denotes the gap that was introduced for
maximal alignment. The symbol (.) is used to indicate that the amino acid residue of
Der f I at that position is identical to the corresponding amino acid residue of Der p I.
The arrows indicate those residues making up the active site of Der p I and Der f I.

Figures 12A and 12B are a comparison of the amino acid sequence in the
pre- and pro-peptide regions of Der f I with those of rat cathepsin H, rat cathepsin L,
papain, aleurain, CP1, CP2, rat cathepsin B, CTLA-2, MCP, Der p I and actinidin.
Gaps, denoted by dashes, were added for maximal alignment. Double asterisks denote
conserved amino acid residues which are shared by greater than 80% of the
proenzymes; single asterisks show residues which are conserved in greater than 55%
of the sequences. The symbol (.) is used to denote semiconserved equivalent amino
acids which are shared by greater than 90% of the proenzyme regions.

Figures 13A and 13B are a hydrophilicity plot of the Der p I mature
protein and a hydrophilicity plot of the Der f I mature protein produced using the
Hopp-Woods algorithm computed with the MacVector Sequence Analysis Software
(IBI, New Haven) using a 6-residue window. Positive values indicate relative
hydrophilicity and negative values indicate relative hydrophobicity.
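
By way of illustration only, a Hopp-Woods profile of the kind plotted in
Figures 13A and 13B can be sketched as a sliding-window average. The scale
below is the published Hopp-Woods hydrophilicity scale; the function name and
the test peptide are hypothetical, and this sketch is not the MacVector
implementation used to produce the figures.

```python
# Hopp-Woods hydrophilicity values, one per amino acid (one-letter code).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return the mean Hopp-Woods value of every full window in the sequence."""
    scores = [HOPP_WOODS[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Hypothetical peptide, not taken from any Der p or Der f sequence:
# a charged stretch followed by a hydrophobic stretch.
profile = hydrophilicity_profile("KDEERSLLVVIF")
```

With this convention, windows over the charged stretch give positive
(hydrophilic) averages and windows over the hydrophobic stretch give negative
averages, matching the sign convention stated for the figures.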

Figure 14 is the nucleotide sequence and the predicted amino acid
sequence of Der f II cDNA (SEQ ID NOS: 7 and 8, respectively). Numbers to the
right are nucleotide positions and numbers above are amino acid residues. The stop
(TAA) signal is underlined. The first 8 nucleotides are from the oligonucleotide
primer used to generate the cDNA, based on the Der p II sequence.

Figure 15 is a restriction map of Der f II cDNA, which was generated by
computer from the sequence data. A map of Der p II similarly generated is shown for
comparison. There are few common restriction enzyme sites conserved. Sites marked
with an asterisk were introduced by cloning procedures.

Figures 16A, 16B and 16C show the alignment of Der f II and Der p II
cDNA sequences. Numbers to the right are nucleotide positions and numbers above
are amino acid residues. The top line gives the Der p II nucleotide sequence and the
second the Der p II amino acid residues. The next two lines show differences of Der f
II to these sequences.

Figures 17A and 17B are hydrophilicity plots of Der f II and Der p II
using the Hopp-Woods algorithm computed with the MacVector Sequence Analysis
Software (IBI, New Haven) using a 6-residue window.

Figure 18 is a composite alignment of the amino acid sequences of five
Der p I clones (a)-(e) which illustrates polymorphism in the Der p I protein (SEQ ID
NO: 11). The numbering refers to the sequence of the Der p I(a) clone. The symbol
(-) is used to indicate that the amino acid residue of a Der p I clone is identical to the
corresponding amino acid residue of Der p I(a) at that position. The amino acid
sequences of these clones indicate that there may be significant variation in Der p I
with five polymorphic amino acid residues found in the five sequences.

Figure 19 is a composite alignment of the amino acid sequences of three
Der p II clones (c), (1) and (2) which illustrates polymorphism in the Der p II protein.
The numbering refers to the sequence of the Der p II(c) clone. The symbol (.) is used
to indicate that the amino acid residue of a Der p II clone is identical to the
corresponding amino acid residue of Der p II(c) at that position.

Figure 20 is a composite alignment of the amino acid sequences of six
Der f II clones (i.e., pFL1, pFL2, MT3, MT5, MT18 and MT16) which illustrates
polymorphism in the Der f II protein (SEQ ID NO: 13). The numbering
refers to the sequence of the Der f II pFL1 clone. The symbol (.) is used to indicate
that the amino acid residue of a Der f II clone is identical to the corresponding
amino acid residue of Der f II pFL1 at that position.

Figures 21A, 21B and 21C are the nucleotide and predicted amino
acid sequences of cDNA λgt11 p1(13T) (SEQ ID NOS: 9 and 10, respectively),
including the full length of the preproenzyme form of Der p I. Negative sequence
numbers refer to the proposed pre- and preproenzyme forms of Der p I.



Detailed Description of the Invention

The present invention relates to a nucleotide sequence coding for an
allergen from the house dust mite Dermatophagoides and to the encoded
Dermatophagoides protein or peptide which includes at least one epitope of the
Dermatophagoides allergen. It particularly relates to a nucleotide sequence
capable of expression in an appropriate host of a major allergen of D. farinae, such
as Der f I or Der f II, or of a peptide which includes at least one epitope of Der f I
or of Der f II. It also particularly relates to a nucleotide sequence capable of
expression in an appropriate host of a major allergen of D. pteronyssinus, such as
Der p I or Der p II, or of a peptide which includes at least one epitope of Der p I or
of Der p II. The Dermatophagoides nucleotide sequence is useful as a probe for
identifying additional nucleotide sequences which hybridize to it and encode other
mite allergens, particularly D. farinae or D. pteronyssinus allergens. Further, the
present invention relates to nucleotide sequences which hybridize to a D. farinae
protein-encoding nucleotide sequence or a D. pteronyssinus protein-encoding
nucleotide sequence but which encode a protein from another species or type of
house dust mite, such as D. microceras (e.g., Der m I and Der m II).

The encoded Dermatophagoides mite allergen or peptide which
includes at least one Dermatophagoides (Der f I or Der f II; Der p I or Der p II)
epitope can be used for diagnostic purposes (e.g., as an antigen) and for therapeutic
purposes (e.g., to desensitize an individual). Alternatively, the encoded house dust
mite allergen can be a protein or peptide, such as a D. microceras protein or
peptide, which displays the antigenicity of or is cross-reactive with a Der f or a
Der p allergen; generally, these have a high degree of amino acid homology.

Accordingly, the present invention also relates to compositions which
include a Dermatophagoides allergen (e.g., Der f I allergen, Der f II allergen; Der p
I or Der p II allergen or other D. allergen cross-reactive therewith) or a peptide
which includes at least one epitope of a Dermatophagoides allergen (Der f I, Der f
II, Der p I, Der p II or other D. allergen cross-reactive therewith) individually or in
combination, and which can be used for therapeutic applications
(e.g., desensitization). As is described below, DNA coding for major allergens from
house dust mites has been isolated and sequenced. In particular, and as is described
in detail in the Examples, cDNA clones coding for the Der p I, Der p II, Der f I
and Der f II allergens have been isolated and sequenced. The nucleotide sequence of
each of these clones has been compared with that of the homologous allergen from the
related mite species (i.e., Der p I and Der f I; Der p II and Der f II), as has the
predicted amino acid sequence of each.

The following is a description of isolation and sequencing of the two
cDNA clones coding for Der f allergens and their comparison with the corresponding
D. pteronyssinus allergen and a description of use of the nucleotide sequences and
encoded products in a diagnostic or a therapeutic context.

Isolation and Sequence Analysis of Der f I

A cDNA clone coding for Der f I, a major allergen from the house dust
mite D. farinae, has been isolated and sequenced. A restriction map of the cDNA
insert of the clone is represented in Figure 9, as is the strategy of DNA sequencing.
This Der f I cDNA clone contains a 1.1-kb cDNA insert encoding a typical signal
peptide, a proenzyme region and the mature Der f I protein. The product is 321 amino
acid residues: a putative 18 residue signal peptide, an 80 residue proenzyme (pro-
peptide) region, and a 223 residue mature enzyme region. The derived molecular
weight is 25,191. The nucleotide sequence and the predicted amino acid sequence of
the Der f I cDNA are represented in Figures 10A and 10B. The deduced amino acid
sequence shows significant homology to other cysteine proteases in the pro-region as
well as in the mature protein. Sequence alignment of the mature Der f I protein with
the homologous allergen Der p I from the related mite D. pteronyssinus (Figure 11)
revealed a high degree of homology (81%) between the two proteins, as predicted by
previous sequencing at the protein level. In particular, the residues comprising the
active site of these enzymes were conserved and a potential N-glycosylation site was
present at equivalent positions in both mite allergens.
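
As a rough illustration, the region lengths quoted above can be checked for
consistency, and a percent identity of the kind reported for the Der f I and
Der p I alignment can be computed as sketched below. The function name, the
treatment of gap columns and the toy sequences are assumptions made for this
sketch; the text does not state how gap positions were counted in arriving at
the 81% figure.

```python
# Region lengths quoted in the text for the Der f I translation product.
SIGNAL_PEPTIDE = 18
PRO_REGION = 80
MATURE_ENZYME = 223

total_residues = SIGNAL_PEPTIDE + PRO_REGION + MATURE_ENZYME


def percent_identity(aligned_a, aligned_b, gap_chars="-*"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned fragments, not taken from the Der p I or Der f I sequences.
toy_identity = percent_identity("TNACSING-A", "TSACRINSVA")
```

The three region lengths sum to the 321 residues stated for the product.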

Conserved cysteine residue pairs (31, 71) and (65, 103), where the
numbering refers to Der p I, are apparently involved in disulphide bond formation on
the basis of the assumed similarity of the three dimensional structure of Der p I and
Der f I to that of papain and actinidin, which also have an additional disulphide
bridge. The fifth and final cysteine residue for which there is a homologous cysteine
residue in papain and actinidin is the active site cysteine (residue 35 in Der f I). It is
not unlikely that the two extra cysteine residues present in Der p I and Der f I may be
involved in forming a third disulphide bridge.

The potential N-glycosylation site in Der p I is also present at the
equivalent position in Der f I with conservation of the crucial first and last residues of
the tripeptide site. The degree of glycosylation
of Der f I and Der p I has yet to be determined. Carbohydrates, including mannose,
galactose, N-acetylglucosamine and N-acetylgalactosamine, have been reported in
purified preparations of these mite allergens (Chapman, M.D., J. Immunol. 125:587-
592 (1980); Wolden, S. et al., Int. Arch. Allergy Appl. Immunol. 144-151 (1982)).

Given the degree of homology over the first thirty N-terminal amino acid
residues between mature Der p I and Der m I (70%) and mature Der f I and Der m I
(97%), with the Der m I residues determined by conventional amino acid sequencing
(Platts-Mills, TAE et al., In: Mite Allergy, a World-Wide Problem, 27-29 (1988);
Lind, P. and N. Horn, In: Mite Allergy, a World-Wide Problem, 30-34 (1988)), it is
probable that the full mature Der m I sequence will confirm an overall 70-80%
homology between the Group I mite allergens. Der m I is an allergen from D.
microceras. High homology between the proenzyme moieties of Der p I and Der f I
(91%) over the residues -23 to -1 and the structural analysis of Der f I suggest that
the Group I allergens are likely to have N-terminal extension peptides of the mature
protein of homologous structure and, at least for the pro-peptide, composition.

Studies on the fine structure of the design of signal sequences have
identified three structurally dissimilar regions so far: a positively charged N-terminal
(n) region, a central hydrophobic (h) region and a more polar C-terminal (c) region
that seems to define the cleavage site (Von Heijne, G., EMBO J. 3:2315-2323 (1984);
Eur. J. Biochem. 133:17-21 (1983); J. Mol. Biol. 184:99-105 (1985)). Analysis of
the signal peptide of Der f I revealed that it, too, contained these regions (Figures 12A
and 12B). The n-region is extremely variable in length and composition, but its net
charge does not vary appreciably with the overall length, and has a mean value of
about +1.7. The n-region of the Der f I signal peptide, with a length of two residues,
has a net charge of +2 contributed by the initiator methionine (which is unformylated
and hence positively charged in eukaryotes) and the adjacent lysine (Lys) residue.
The h-region of Der f I is enriched with hydrophobic residues, the characteristic
feature of this region, with only one hydrophilic residue, serine (Ser), present, which
can be tolerated. The overall amino acid composition of the Der f I c-region is more
polar than that of the h-region, as is found in signal sequences, with the h/c boundary
located between residues -6 and -5, which is its mean position in eukaryotes. Thus,
the Der f I pre-peptide sequence appears to fulfill the requirements to which a
functional signal sequence must conform.
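
The n-region charge count described above can be sketched as follows. This is
a simple counting rule assumed for illustration (one positive charge for the
free amino group of an unformylated initiator methionine, one for each Lys or
Arg, and one negative charge for each Asp or Glu); the function name is
hypothetical and the rule is not stated in this form in the cited references.

```python
POSITIVE_RESIDUES = {"K", "R"}
NEGATIVE_RESIDUES = {"D", "E"}


def n_region_net_charge(n_region, unformylated_start=True):
    """Net charge of a signal peptide n-region under a simple counting rule."""
    charge = 0
    for residue in n_region.upper():
        if residue in POSITIVE_RESIDUES:
            charge += 1
        elif residue in NEGATIVE_RESIDUES:
            charge -= 1
    if unformylated_start and n_region:
        charge += 1  # free amino group of the unformylated initiator methionine
    return charge


# Two-residue n-region described in the text: initiator Met followed by Lys.
der_f_1_n_region_charge = n_region_net_charge("MK")
```

Under these assumptions the two-residue Met-Lys n-region evaluates to a net
charge of +2, in agreement with the value given in the text.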
While the signal sequences of Der f I and other cysteine proteases share
structural homology, all being composed of the n, h and c-regions, they are highly
variable with respect to overall length and amino acid sequence, as is clear in Figures
12A and 12B. However, significant sequence homology has been shown between the
pro-regions of cysteine protease precursors (Ishidoh, K. et al., FEBS Letters 226:33-
37 (1987)). Alignment of the proenzyme regions of Der f I and a number of other
cysteine proteases (Figures 12A and 12B) indicated that these proregions share a
number of very conserved residues as well as semi-conserved residues which were
present in over half of the sequences. This homology was increased if conservative
amino acids such as valine (Val), isoleucine (Ile) and leucine (Leu) (small
hydrophobic residues) or arginine (Arg) and Lys (positively charged residues) were
regarded as identical. The Der f I proregion possessed six out of seven highly
conserved amino acids and all the residues at sites of conservative changes. The
homology at less conserved sites was lower. Homology in the pro-peptide, in
particular the highly conserved residues, may be important when considering the
function of the pro-peptide in the processing of these enzymes, since it indicates that
these sequences probably have structural and functional similarities.

Highly cross-reactive B cell epitopes on Der f I and Der p I have been
demonstrated using antibodies present in mouse, rabbit and human sera (Heymann,
P.W. et al., J. Immunol. 137:2841-2847 (1986); Platts-Mills, TAE et al.,
J. Allergy Clin. Immunol. 78:398-407 (1986)). However, species-specific epitopes
have also been defined in these systems. Murine monoclonal antibodies bound
predominantly to species-specific determinants (Platts-Mills, TAE et al.,
J. Allergy Clin. Immunol. 1479-1484 (1987)). Some 40% of rabbit anti-Der p I
reactivity was accounted for by epitopes unique to Der p I (Platts-Mills, TAE et al.,
J. Allergy Clin. Immunol. 78:398-407 (1986)), and some species-specific binding of
antibodies from allergic humans was observed, although the majority bind to cross-
reactive epitopes (Platts-Mills, TAE et al., J. Immunol. 132:1479-1484 (1987)).

The recombinant DNA strategy of gene fragmentation and expression
was used (Greene, W.K. et al., Immunol. (1990)) to define five antigenic regions of
recombinant Der p I which contained B cell epitopes recognized by a rabbit anti-Der p
I antiserum. Using the technique of immunoabsorption, three of these putative
epitopes were shown to be shared with Der f I (located on regions containing amino
acid residues 34-47, 60-72 and 166-194) while two appeared to be specific for Der p I
(regions 82-99 and 112-140). Differences in the reactivity of these peptides to rabbit
anti-D. farinae supported the above division into cross-reactive and species-specific
epitopes. The sequence differences shown between
the Der p I and the Der f I proteins are primarily located in the N and C terminal
regions, as well as in an extended surface loop (residues 85-136) linking the two
domains of the enzyme that includes helix D (residues 127-136), as predicted from the
secondary and tertiary structures of papain and actinidin (Baker, E.N. and J. Drenth,
In: Biological Macromolecules and Assemblies, Vol. 3, pp. 314-368, John Wiley and
Sons, NY (1987)). The surface location of these residues is supported by the
hydrophilicity plots of Der p I and Der f I in Figures 13A and 13B, which illustrate the
predominantly hydrophilic nature of this region that predicts surface exposure. This
region also contains the two species-specific B cell epitopes recognized by the rabbit
anti-Der p I serum (see above). The sequences in the regions containing two of
the cross-reactive epitopes (residues 34-47 and 60-72) are completely
conserved between Der p I and Der f I, while the majority of residues in the third
cross-reactive epitope-containing region (residues 166-194) are conserved.

Expression of cDNA encoding Der f I results in production of pre-
pro-Der f I protein in E. coli, a recombinant protein of greater solubility, stability
and antigenicity than that of recombinant Der p I. Protein encoded by Der f I
cDNA has been expressed using a pGEX vector and has been shown by
radioimmune assay to react with rabbit anti-D. farinae antibodies. The availability
of high yields of soluble Der f I allergen and antigenic derivatives will facilitate
the development of diagnostic and therapeutic agents and the mapping of B and T
cell antigenic determinants.

With the availability of the complete amino acid sequence of
recombinant Der f I, mapping of the epitopes recognized by both the B and T cell
compartments of the immune system can be carried out. The use of techniques
such as the screening of overlapping synthetic peptides, the use of monoclonal
antibodies and gene fragmentation and expression should enable the identification
of both the continuous and topographical epitopes of Der f I. It will be
particularly useful to determine whether allergenic (IgE-binding) determinants
have common features and are intrinsically different from antigenic (IgG-binding)
determinants and whether T cells recognize unique epitopes different from those
recognized by B cells. Studies to identify the Der f I epitopes reactive with mite
allergic human IgE antibodies and the division of these into determinants cross-
reactive with Der p I and determinants unique to Der f I can also be carried out. B
cell (and T cell) epitopes specific for either species can be used to provide useful
diagnostic reagents for determining reactivity to the different mite species, while
cross-reacting epitopes are candidates for a common immunotherapeutic agent.




WO 94/05790 



PCT/US93/08518 




As described in detail in the Examples, a cDNA clone coding for
Der p I which contained a 0.8-kb cDNA insert has been isolated. Sequence
analysis revealed that the 222 amino acid residue mature recombinant Der p I
protein showed significant homology with a group of cysteine proteases, including
actinidin, papain, cathepsin H and cathepsin B.

Isolation and Sequence Analysis of Der f II

A cDNA clone coding for Der f II, a major allergen from the house
dust mite D. farinae, has been isolated and sequenced, as described in the
Examples. The nucleotide sequence and the predicted amino acid sequence of the
Der f II cDNA are represented in Figure 14. A restriction map of the cDNA insert
of a clone coding for Der f II is represented in Figure 15.

Figures 16A, 16B and 16C show the alignment of Der f II and Der p
II cDNA sequences. The homology of the sequence of Der f II with Der p II
(88%) is higher than the 81% homology found between Der p I and Der f I, a
difference which is significant (p<0.05) using the chi-squared distribution. The
reason for this may simply be that the Group I allergens are larger and each residue
may be less critical for the structure and function of the molecule. It is known, for
example, that assuming they adopt a similar conformation to other cysteine
proteases, many of the amino acid differences in Der p I and Der f I lie in residues
linking the two domain structures of the molecules. The six cysteine residues are
conserved between the Group II allergens, suggesting a similar disulphide bonding,
although this may be expected, given the high overall homology. Another indication
of the conservation of these proteins is that 34/55 of the nucleotide changes of the
coding sequence are in the third base of a codon, which usually does not change
the amino acid. A residue that may be of importance in the function of the
molecule is Ser 57, where all three bases are changed but the amino acid is
conserved. A similar phenomenon exists at residue 88, where a complete codon
change has conserved a small aliphatic residue. Again, like Der p II, the Der f II
cDNA clone does not have a poly A tail, although the 3' non-coding region is rich
in adenosine and has two possible polyadenylation signals ATAA. The
nucleotides encoding the first four residues are from the PCR primer which was
designed from the known homology of Der p II and Der f II from N-terminal
amino acid sequencing. A primer based on the C-terminal sequence can now be
used to determine these bases, as well as the signal sequence.
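
For illustration, the tally of nucleotide changes by codon position mentioned
above (34 of 55 changes falling in the third base) can be sketched as below.
The function name and the two six-base sequences are hypothetical and are not
taken from the Der f II or Der p II cDNA; they were chosen so that one codon
differs only in its third base and the other differs in all three bases while
still encoding serine, as described for Ser 57.

```python
def changes_by_codon_position(cds_a, cds_b):
    """Count nucleotide differences at codon positions 1, 2 and 3."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and of equal length")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (base_a, base_b) in enumerate(zip(cds_a.upper(), cds_b.upper())):
        if base_a != base_b:
            counts[index % 3 + 1] += 1
    return counts


# Hypothetical example: GTT -> GTC (valine kept, third base only) and
# TCT -> AGC (serine kept, all three bases changed).
toy_counts = changes_by_codon_position("GTTTCT", "GTCAGC")

# Fraction of changes in the third base, using the counts quoted in the text.
third_base_fraction = 34 / 55
```

The quoted counts correspond to roughly 62% of the nucleotide changes lying in
the third codon position.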
Uses of the subject allergenic proteins/peptides and DNA encoding same

The materials resulting from the work described herein, as well as
compositions containing these materials, can be used in methods of diagnosing,
treating and preventing allergic responses to mite allergens, particularly to mites
of the genus Dermatophagoides, such as D. farinae and D. pteronyssinus. In
addition, the cDNA (or the mRNA from which it was transcribed) can be used to
identify other similar sequences. This can be carried out, for example, under
conditions of low stringency and those sequences having sufficient homology
(generally greater than 40%) can be selected for further assessment using the
method described herein. Alternatively, high stringency conditions can be used.
In this manner, DNA of the present invention can be used to identify sequences
coding for mite allergens having amino acid sequences similar to that of Der f I,
Der f II, Der p I or Der p II. Thus, the present invention includes not only
D. farinae and D. pteronyssinus allergens, but other mite allergens as well (e.g.,
other mite allergens encoded by DNA which hybridizes to DNA of the present
invention).
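
As a computational analogue of the 40% homology criterion, the following minimal sketch computes ungapped percent identity between pre-aligned sequences and keeps candidates above the cutoff. The sequences and clone names are hypothetical, and real hybridization-based selection is empirical rather than a direct identity calculation.

```python
HOMOLOGY_CUTOFF = 40.0  # percent, the "generally greater than 40%" figure in the text


def percent_identity(seq_a, seq_b):
    """Ungapped percent identity of two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def select_for_assessment(probe, candidates, cutoff=HOMOLOGY_CUTOFF):
    """Return names of candidate sequences whose identity to the probe exceeds the cutoff."""
    return [
        name
        for name, seq in candidates.items()
        if percent_identity(probe, seq) > cutoff
    ]


# Hypothetical probe and candidate sequences, for illustration only.
probe = "ACGTACGTAC"
candidates = {"clone_a": "ACGTTCGTAA", "clone_b": "TTTTTTTTTT"}
print(select_for_assessment(probe, candidates))
```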

Proteins or peptides encoded by the cDNA of the present invention
can be used, for example, as "purified" allergens. Such purified allergens are
useful in the standardization of allergen extracts or preparations which can be
used as reagents for the diagnosis and treatment of allergy to house dust mites.
Through use of the peptides of the present invention, allergen preparations of
consistent, well-defined composition and biological activity can be made and
administered for therapeutic purposes (e.g., to modify the allergic response of a
house dust mite-sensitive individual). Der f I or Der f II peptides or proteins (or
modified versions thereof, such as are described below) may, for example, modify
B-cell response to Der f I or Der f II, T-cell response to Der f I and Der f II, or
both responses. Similarly, Der p I or Der p II proteins or peptides may be used to
modify B-cell and/or T-cell response to Der p I or Der p II. Purified allergens can
also be used to study the mechanism of immunotherapy of allergy to house dust
mites, particularly to Der f I, Der f II, Der p I and Der p II, and to design modified
derivatives or analogues which are more useful in immunotherapy than are the
unmodified ("naturally-occurring") peptides.

In those instances in which there are epitopes which are cross-reactive,
such as the three epitopes described herein which are shared by Der f I
and Der p I, the area(s) of the molecule which contain the cross-reactive epitopes
can be used as common immunotherapeutic peptides to be administered in treating
allergy to the two (or more) mite species which share the epitope. For example,
the cross-reactive epitopes could be used to induce IgG blocking antibody against
both allergens (e.g., Der f I and Der p I allergen). A peptide containing a
univalent antibody epitope can be used, rather than the entire molecule, and may
prove advantageous because the univalent antibody epitope cannot crosslink mast
cells and cause adverse reactions during desensitizing treatments. It is also
possible to attach a B cell epitope to a carrier molecule to direct T cell control of
allergic responses.

Alternatively, it may be desirable or necessary to have peptides which
are specific to a selected Dermatophagoides allergen. As described herein, two
epitopes which are apparently Der p I-specific have been identified. A similar
approach can be used to identify other species-specific epitopes (e.g., Der p I or II,
Der f I or II). The presence in an individual of antibodies to the species-specific
epitopes can be used as a quick serological test to determine which mite species is
causing the allergic response. This would make it possible to specifically target
therapy provided to an individual to the causative species and, thus, enhance the
therapeutic effect.

Work by others has shown that high doses of allergens generally
produce the best results (i.e., best symptom relief). However, many people are
unable to tolerate large doses of allergens because of allergic reactions to the
allergens. Naturally-occurring allergens can be modified in such a manner that
the resulting modified peptides or modified allergens have the same or
enhanced therapeutic properties as the corresponding naturally-occurring allergen
but have reduced side effects (especially anaphylactic reactions).
These can be, for example, a peptide of the present invention (e.g., one having all
or a portion of the amino acid sequence of Der f I or Der f II, Der p I or Der p II).
Alternatively, a combination of peptides can be administered. A modified peptide
or peptide analogue (e.g., a peptide in which the amino acid sequence has been
altered to modify immunogenicity and/or reduce allergenicity or to which a
component has been added for the same purpose) can be used for desensitization
therapy.

Administration of the peptides of the present invention to an
individual to be desensitized can be carried out using known techniques. A
peptide or combination of different peptides can be administered to an individual
in a composition which includes, for example, an appropriate buffer, a carrier
and/or an adjuvant. Such compositions will generally be administered by
injection, inhalation, transdermal application or rectal administration. Using the
information now available, it is possible to design a Der p I, Der p II, Der f I or
Der f II peptide which, when administered to a sensitive individual in sufficient
quantities, will modify the individual's allergic response to Der p I, Der p II, Der f
I and/or Der f II. This can be done, for example, by examining the structures of
these allergens, producing peptides to be examined for their ability to influence
B-cell and/or T-cell responses in house dust mite-sensitive individuals and selecting
appropriate epitopes recognized by the cells. Synthetic amino acid sequences
which mimic those of the epitopes and which are capable of down regulating
allergic response to Der p I, Der p II, Der f I or Der f II allergens can be made.

Proteins, peptides or antibodies of the present invention can also be used, in
known methods, for detecting and diagnosing allergic response to Der f I or Der f
II. For example, this can be done by combining blood obtained from an
individual to be assessed for sensitivity to one of these allergens with an isolated
allergenic peptide of house dust mite, under conditions appropriate for binding of
or stimulating components (e.g., antibodies, T cells, B cells) in the blood with the
peptide and determining the extent to which such binding occurs. Der f I and Der f II
proteins or peptides can be administered together to treat an individual sensitive to
both allergen types.

It is now also possible to design an agent or a drug capable of
blocking or inhibiting the ability of Der p I, Der p II, Der f I or Der f II to induce
an allergic reaction in house dust mite-sensitive individuals. Such agents could be
designed, for example, in such a manner that they would bind to relevant
anti-Der p I, anti-Der p II, anti-Der f I or anti-Der f II IgEs, thus preventing
IgE-allergen binding and subsequent mast cell degranulation. Alternatively, such
agents could bind to cellular components of the immune system, resulting in
suppression or desensitization of the allergic response to these allergens. A
non-restrictive example of this is the use of appropriate B- and T-cell epitope peptides,
or modifications thereof, based on the cDNA/protein structures of the present
invention to suppress the allergic response to these allergens. This can be carried
out by defining the structures of B- and T-cell epitope peptides which affect
B- and T-cell function in in vitro studies with blood cells from house dust
mite-sensitive individuals.

The cDNA encoding Der p I, Der p II, Der f I or Der f II or a peptide
including at least one epitope thereof can be used to produce additional peptides,
using known techniques such as gene cloning. A method of producing a protein
or a peptide of the present invention can include, for example, culturing a host cell
containing an expression vector which, in turn, contains DNA encoding all or a
portion of a selected allergenic protein or peptide (e.g., Der p I, Der p II, Der f I,
Der f II or a peptide including at least one epitope). Cells are cultured under
conditions appropriate for expression of the DNA insert (production of the
encoded protein or peptide). The expressed product is then recovered, using
known techniques. Alternatively, the allergen or portion thereof can be
synthesized using known mechanical or chemical techniques. As used herein, the
term protein or peptide refers to proteins or peptides made by any of these
techniques. The resulting peptide can, in turn, be used as described previously.

DNA to be used in any embodiment of this invention can be cDNA
obtained as described herein or, alternatively, can be any oligodeoxynucleotide
sequence having all or a portion of the sequence represented in Figures 1A and 1B,
7A and 7B, 10A and 10B, and 14 or their functional equivalent. Such
oligodeoxynucleotide sequences can be produced chemically or mechanically,
using known techniques. A functional equivalent of an oligonucleotide sequence
is one which is capable of hybridizing to a complementary oligonucleotide
sequence to which the sequence (or corresponding sequence portions) of Figures
1A and 1B, 7A and 7B, 10A and 10B, and 14 hybridizes and/or which encodes a
product (e.g., a polypeptide or peptide) having the same functional characteristics
as the product encoded by the sequence (or corresponding sequence portion)
represented in these figures. Whether a functional equivalent must meet one or
both criteria will depend on its use (e.g., if it is to be used only as an oligoprobe, it
need meet only the first criterion and if it is to be used to produce house dust mite
allergen, it need only meet the second criterion).

The structural information now available (e.g., DNA, protein/peptide 
sequences) can also be used to identify or define T cell epitope peptides and/or B 
cell epitope peptides which are of importance in allergic reactions to house dust 
mite allergens and to elucidate the mediators or mechanisms (e.g., interleukin-2, 
interleukin-4, gamma interferon) by which these reactions occur. This knowledge 
should make it possible to design peptide-based house dust mite therapeutic agents 
or drugs which can be used to modulate these responses. 

The present invention will now be further illustrated by the following 
Examples, which are not intended to be limiting in any way. 



EXAMPLE 1 

MATERIALS AND METHODS

Cloning and Expression of Der p I cDNA

Polyadenylated mRNA was isolated from the mite
Dermatophagoides pteronyssinus cultured by Commonwealth Serum Laboratories,
Parkville, Australia, and cDNA was synthesized by the RNase H method (5)
using a kit (Amersham International, Bucks). After the addition of EcoRI linkers
the cDNA was ligated into λgt11 and plated in E. coli Y1090 (r-) (Promega Biotec,
Madison, Wisconsin), to produce a library of 5 x 10^5 recombinants. Screening was
performed by plaque radioimmune assay (6) using a rabbit anti-Der p I antiserum
(7). Reactivity was detected by [...]

[...] hydrochloride in 0.1 M sodium acetate buffer pH 5.2 were then added and the
mixture was homogenized and spun at 10,000 rpm for 30 min in a Sorvall SS34 rotor.
The supernatant was collected and layered onto a CsCl pad (5 ml of 4.8 M CsCl in 10
mM EDTA) and centrifuged at 37,000 rpm for 16 h at 15°C in a SW41 TI rotor
(Beckman Instruments, Inc., Fullerton, CA). The DNA band at the interphase was
collected and diluted 1:15 in 10 mM Tris-HCl/1 mM EDTA buffer, pH 8.0.
Banding of genomic DNA in CsCl was carried out by the standard method.

Isolation of DNA from λgt11 p1 cDNA Clone

Phage DNA from the λgt11 p1 clone was prepared by a rapid isolation
procedure. Clarified phage plate lysate (1 ml) was mixed with 270 µl of 25%
wt/vol polyethylene glycol (PEG 6000) in 2.5 M NaCl and incubated at room
temperature for 15 min. The mixture was then spun for 5 min in a microfuge
(Eppendorf, Federal Republic of Germany), and the supernatant was removed.
The pellet was dissolved in 100 µl of 10 mM Tris/HCl pH 8.0 containing 1 mM
EDTA and 100 mM NaCl. This DNA preparation was extracted 3 times with
phenol/chloroform (1:1) and the DNA was precipitated by ethanol.
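
The final PEG and NaCl concentrations in the precipitation step follow from a simple dilution calculation, sketched below. The sketch assumes the PEG/NaCl stock volume is 270 microlitres added to 1 ml of lysate (the volume unit is partly illegible in the source, so this is an assumption).

```python
def final_concentration(stock_conc, stock_volume, sample_volume):
    """Concentration of a stock component after mixing with a sample (same volume units)."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


LYSATE_UL = 1000.0    # 1 ml of clarified phage plate lysate
PEG_STOCK_UL = 270.0  # assumed microlitres of 25% PEG 6000 in 2.5 M NaCl

peg_percent = final_concentration(25.0, PEG_STOCK_UL, LYSATE_UL)
nacl_molar = final_concentration(2.5, PEG_STOCK_UL, LYSATE_UL)

print(f"PEG 6000: {peg_percent:.1f}% wt/vol, NaCl: {nacl_molar:.2f} M")
```

Under these assumptions the mixture is roughly 5% PEG and 0.5 M NaCl.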

DNA Hybridization

Nucleic acid was radiolabeled with 32P by nick translation (10).
DNA samples were digested with appropriate restriction enzymes using
conditions recommended by the supplier. Southern blots were prepared using
Zeta-Probe membranes (Bio-Rad Laboratories, Richmond, CA).
Prehybridization, hybridization and posthybridization washes were carried out
according to the manufacturer's recommendations (bulletin 1234, Bio-Rad
Laboratories).

Cloning and DNA Sequencing

To clone the 0.8-kb cDNA insert from clone λgt11 p1 into plasmid
pUC8, phage DNA was digested with EcoRI restriction enzyme and then ligated
to EcoRI-digested pUC8 DNA and used to transform Escherichia coli JM83. The
resulting recombinant plasmid was designated as pHDM1.

To obtain clones for DNA sequence analysis, the cDNA insert was
isolated from pHDM1 and ligated to M13-derived sequencing vectors mp18 and
mp19 (16). Transformation was carried out using E. coli JM107 and sequencing
was performed by the dideoxynucleotide chain termination method (11).






WO 94/05790 



PCT/US93/08S18 




RESULTS

Several phage clones reacted with the rabbit anti-Der p I serum and
hybridized with all 3 oligonucleotide probes. One of these, λgt11 p1(13T), was
examined further. The nucleotide sequence of the cDNA insert from this clone,
λgt11 p1, was determined using the sequencing strategy shown in Fig. 2. The
complete sequence was shown to be 857 bases long and included a 69-base-long 5'
proximal end sequence, a coding region for the entire native Der p I protein of 222
amino acids with a derived molecular weight of 25,371, an 89-base-long 3'
noncoding region and a poly (A) tail of 33 residues (Figures 1A and 1B).
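
The reported segment lengths can be checked against the 857-base total with a short calculation. This sketch assumes that the 89-base 3' noncoding region is counted as including the stop codon, which is the reading under which the figures add up; the variable names are illustrative.

```python
FIVE_PRIME_BASES = 69    # 5' proximal end sequence
MATURE_RESIDUES = 222    # native Der p I protein
THREE_PRIME_BASES = 89   # 3' noncoding region (assumed to include the stop codon)
POLY_A_BASES = 33        # poly (A) tail

coding_bases = 3 * MATURE_RESIDUES                   # three bases per residue
coding_start = FIVE_PRIME_BASES + 1                  # first base of codon 1
coding_end = FIVE_PRIME_BASES + coding_bases         # last base of codon 222
stop_codon = (coding_end + 1, coding_end + 3)        # codon immediately downstream

total_bases = FIVE_PRIME_BASES + coding_bases + THREE_PRIME_BASES + POLY_A_BASES

print(f"mature coding region: {coding_start}-{coding_end}")
print(f"stop codon: {stop_codon[0]}-{stop_codon[1]}")
print(f"total length: {total_bases} bases")
```

The computed stop codon position agrees with the TAA codon reported at nucleotides 736-738.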

The assignment of a threonine residue at position 1 as the NH2-terminal
amino acid of Der p I was based on data obtained by NH2-terminal amino
acid sequencing of the pure protein isolated from mite excretions (17). The
predicted amino acid sequence matched with data obtained by amino acid sequence
analysis of the NH2-terminal region as well as with internal sequences derived from
analyses of tryptic peptides (Figures 1A and 1B). The complete mature protein is
coded by a single open reading frame terminating at the TAA stop codon at
nucleotide position 736-738. At present, it is not certain whether the first ATG
codon at nucleotide position 16-18 is the translation initiation site, since the
immediate flanking sequence of this ATG codon (TTGATGA) showed no
homology with the Kozak consensus sequence (ACCATGG) for the eukaryotic
translation initiation sites (18). In addition, the 5' proximal end sequence does not
code for a typical signal peptide sequence (see below).
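
The lack of homology between the ATG context and the Kozak consensus can be made concrete by comparing the flanking positions one by one. The following is a minimal sketch using only the two seven-base strings quoted in the text; the function name is illustrative.

```python
def flank_matches(site, consensus, codon="ATG"):
    """Count matching bases outside the initiation codon for two equal-length contexts."""
    start = consensus.index(codon)
    if site[start:start + 3] != codon:
        raise ValueError("site does not carry the codon at the consensus position")
    flank_positions = [
        i for i in range(len(consensus)) if not start <= i < start + 3
    ]
    matches = sum(site[i] == consensus[i] for i in flank_positions)
    return matches, len(flank_positions)


matches, compared = flank_matches("TTGATGA", "ACCATGG")
print(f"{matches} of {compared} flanking positions match the consensus")
```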

The amino acid sequence predicted by nucleotide analysis is shown in
Figures 1A and 1B. A protein data-base search revealed that the Der p I amino
acid sequence showed homology with a group of cysteine proteases. Previous
cDNA studies have shown that lysosomal cathepsin B, a mouse macrophage
protease and a cysteine protease from an amoeba have transient pre- and proform
intermediates (19-21), and inspection of the amino acid sequence at the 5' proximal
end of the λgt11 p1 cDNA clone suggests that Der p I may be similar. First, the
hydrophilicity plot (22) of the sequence preceding the mature protein sequence
lacks the characteristic hydrophobic region of a signal peptide (23) and second, an
Ala-X-Ala sequence, the most frequent sequence preceding the signal peptidase
cleavage site (24,25), is present at positions -13, -14, -15 (Figures 1A and 1B).
Therefore, it is proposed that cleavage between the pro-Der p I sequence and the
pre-Der p I sequence occurs between Ala (-13) and Phe (-12). Thus, the pro-Der p I
sequence begins at residue Phe (-12) and ends at residue Glu (-1). The amino
acid residues numbered -13 to -23 would then correspond to a partial signal
peptide sequence. The full length of the Der p I preproenzyme sequence has been
determined and is shown in Figures 21A and 21B. The negative sequence
numbers refer to the pre- and preproenzyme forms of Der p I.

When the 857-bp cDNA insert was radiolabeled and hybridized
against a Southern blot of EcoRI-digested genomic DNA from house dust mite,
hybridization to bands of 1.5, 0.5, and 0.35 kb was observed (data not shown). As
shown in the restriction enzyme map of the cDNA insert (Figure 2), there was no
internal EcoRI site and the multiple hybridization bands observed suggest that
Der p I is coded by a noncontiguous gene. The results also showed little evidence
of gene duplication since hybridization was restricted to fragments with a total
length of 2.4 kb.
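
The total fragment length quoted above is the sum of the three hybridizing bands, rounded. A brief sketch of the arithmetic, with the cDNA length included for comparison:

```python
# Hybridizing EcoRI bands on the genomic Southern blot, in kb (from the text).
bands_kb = [1.5, 0.5, 0.35]
total_kb = sum(bands_kb)

# Length of the cDNA probe, in kb (857 bases).
cdna_kb = 0.857

print(f"total hybridizing genomic DNA: {total_kb:.2f} kb")
print(f"genomic span exceeds cDNA length: {total_kb > cdna_kb}")
```

A genomic span larger than the cDNA, split over several EcoRI fragments although the cDNA has no internal EcoRI site, is what would be expected if the coding sequence is interrupted in the genome.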

The N-terminal sequence can be compared with the N-terminal sequence of the
equivalent protein from D. farinae (Der f I) (12). There is identity in 11/20 positions
of the sequences available for comparison (Fig. 3).

To examine the protein produced by λgt11 p1(13T), phage was
lysogenized in Y1089 (r-) and the bacteria grown in broth culture at 30°C. Phage
was induced by temperature switch and isopropyl thiogalactopyranoside (IPTG)
(6) and the bacteria were suspended in PBS to 1/20 of the culture volume, and
sonicated for an antigen preparation. When examined by 7.5% SDS-PAGE
electrophoresis it was found that λgt11 p1(13T) did not produce a Mr 116K
β-galactosidase band but instead produced a 140K band consistent with a fusion
protein with the Der p I contributing a 24 kDa moiety (6). Rabbit anti-Der p I was
shown to react with the lysate from λgt11 p1(13T) (Fig. 4).



EXAMPLE 2

Expression of Der p I cDNA products reactive with IgE from allergic serum

The DNA insert from λgt11 p1(13T) which codes for Der p I was
subcloned into the EcoRI site of the plasmid expression vector (pGEX) (26) where
it could be expressed as a fusion with a glutathione transferase molecule. E. coli
infected with this plasmid pGEX-p1(13T) or with the vector alone were grown to a
log phase culture and harvested by centrifugation. The bacteria were suspended in
PBS to 1/20 of their culture volume and lysed by freeze-thawing. The lysate was
shown by sodium dodecyl sulphate polyacrylamide electrophoresis to express a
fusion protein in high concentration of the expected Mr 50,000. These lysates
were then tested for their ability to react with IgE from allergic serum by
radioimmune dot-blot conducted by the method described by Thomas and Rossi
(27). The serum was taken from donors known to be mite-allergic or from
non-allergic controls. Reactivity was developed by 125I-monoclonal anti-IgE and
autoradiography. Figure 5 shows that the lysate from pGEX-p1(13T), but not the
vector control, reacted with IgE in allergic serum but not in non-allergic serum.



EXAMPLE 3

Inhibition of IgE antibody responses to Der p I by
treatment with the product from a cDNA clone
coding for Der p I

E. coli lysogenized by λgt11 p1(13T) were grown and induced by
temperature switch to produce a recombinant fusion protein which was consistent
with a 24 kD Der p I moiety and a 116 kD β-galactosidase moiety (p1(13T)) (28).
This protein was mostly insoluble and could be isolated to about 90% purity,
judged by sodium dodecyl sulphate polyacrylamide electrophoresis, by differential
centrifugation. A similar protein was produced from another λgt11 cDNA mite
clone, λgt11 pX(2c). To test for the ability of the recombinant protein to modify IgE
antibody responses to Der p I, groups of 4-5 CBA mice were injected
intraperitoneally with 2 mg of the p1(13T) or pX(2c) fusion proteins and after 2
days given a subcutaneous injection of 5 mg of native Der p I (from mite culture
medium) in aluminium hydroxide gel. The IgE antibody titres were measured by
passive cutaneous anaphylaxis (PCA) after 3 and 6 weeks. The methods and
background data for these responses have been described by Stewart and Holt
(29). For a specificity control, groups of mice injected with p1(13T) or pX(2c)
were also injected with 10 mg of ovalbumin in alum. Responses were compared to
mice without prior p1(13T) or pX(2c) treatment (Table 1). After 3 weeks, mice
either not given an injection of recombinant protein or injected with the control
pX(2c) had detectable anti-Der p I PCA titres (1/2 or greater). Only 1/5 of mice
treated with recombinant p1(13T) had a detectable titre and this, at 1/4, was lower
than all of the titres of both control groups. Titres of all groups at 6 weeks were
low or absent (not shown). The PCA response to ovalbumin was not significantly
affected by treatment with recombinant proteins. These data show the potential of
the recombinant proteins to specifically decrease IgE responses as required for a
desensitizing agent.
TABLE 1  Inhibition of anti-Der p I IgE by preinjection with recombinant
Der p I.

         preinjection   immunizing injection   IgE (PCA) titres at d21
group    (-2 days)      (d0) (5mg/alum)        responders   titres

1        -              Der p I                4/4          1/16-1/64
2        pX(2c)         Der p I                5/5          1/8-1/16
3        p1(13T)        Der p I                1/5*         1/4*

4        -              ovalbumin              4/4          1/64-1/256
5        pX(2c)         ovalbumin              5/5          1/32-1/128
6        p1(13T)        ovalbumin              5/5          1/64-1/256

Mice were given a preinjection on day -2 and then immunized with Der p I or
ovalbumin on day 0. Serum antibody titres were measured on day 21 and 42 by
PCA in rat skin. Significant anti-Der p I titres were not detected on day 42 (not
shown). The PCA were measured to Der p I for groups 1-3 and ovalbumin for
groups 4-6. The anti-Der p I titres were lower (p<0.001)* when pretreated with
recombinant Der p I p1(13T).

*Mann Whitney analysis.

EXAMPLE 4 

Expression of Der p I antigenic determinants by
fragments of the cDNA from λgt11 p1(13T)

The cDNA from λgt11 p1(13T) coding for Der p I was fragmented by
sonication. The fragments (in varying size ranges) were isolated by
electrophoresis and filled in by the Klenow reaction to create blunt ends. EcoRI
linkers were attached and the fragment libraries cloned in λgt11. The methods
used for the fragment cloning were the same as those used for cDNA cloning (6).
Plaque immunoassay was used for screening with rabbit anti-Der p I. Three phage
clones reacting with the antiserum were isolated and the oligonucleotide sequences
of the cloned fragments obtained. Two of these were found to code for Der p I
amino acids 17-55 (see Figures 1A and 1B for numbering) and one for amino acids
70-100. Such fragments will eventually be useful both as diagnostic reagents to
determine epitope reactivity and for therapy where molecules of limited
allergenicity may increase safety of desensitisation.

EXAMPLE 5 

Cloning and expression of cDNA coding for the major mite allergen Der p II

The Dermatophagoides pteronyssinus cDNA library in λgt11
previously described was screened by plaque radioimmune assay using
nitrocellulose lifts (6). Instead of specific antisera, the serum used was from a
person allergic to house dust mites. The serum (at 1/2 dilution) was absorbed with
E. coli. To detect reactivity an 125I-labelled monoclonal anti-IgE was used (at
30 ng/ml with 2 x 10^6 cpm/ml (approx. 30% counting efficiency)). After 1 hour the
filters were washed and autoradiography performed. Using this procedure 4
clones reacting with human IgE were isolated. It was found they were related by
DNA hybridization and had an identical pattern of reactivity against a panel of
allergic sera. Fig. 6 shows IgE reactivity in plaque radioimmunoassay against
allergic serum (AM) (top row) or non-allergic serum (WT). Here, clones 1, 3 and 8
react strongly, but only against allergic sera. The amp 1 segments (present in row 1)
are a λgt11 vector control. The bottom row is an immunoassay with rabbit
anti-Der p I, developed by 125I staphylococcus protein A, which shows no significant
reactivity. The clones were tested against a panel of sera. Serum from five
patients without allergy to mite did not react, but serum from 14/17 people with
mite allergy showed reactivity. The DNA insert from the clone λgt11 pII(C1) was
subcloned into M13 mp18 and M13 mp19 and sequenced by the chain termination
method. The nucleotide sequence (Figures 7A and 7B) showed this allergen was
Der p II by (a) the homology of the inferred amino acid sequence of residues 1-40
with that of the N-terminal amino acid sequence of Der p II (30); and (b) the
homology of this sequence with the equivalent Der f II allergen from
Dermatophagoides farinae (30).

EXAMPLE 6 

Isolation and Characterization of cDNA Coding for Der f I

MATERIALS AND METHODS

Dermatophagoides farinae culture

Mites were purchased from Commonwealth Serum Laboratories,
Parkville, Australia.
Construction of the D. farinae cDNA λgt11 library

Polyadenylated mRNA was isolated from live D. farinae mites and
cDNA was synthesized by the RNase H method (Gubler, U. and B.J. Hoffman,
Gene 25:263-269 (1983)) using a kit (Amersham International, Bucks.). After the
addition of EcoRI linkers (New England Biolabs, Beverly, MA) the cDNA was
ligated to alkaline phosphatase treated λgt11 arms (Promega, Madison, WI). The
ligated DNA was packaged and plated in E. coli Y1090 (r-) to produce a library of
2 x 10^[...] recombinants.

Isolation of Der f I cDNA clones from the D. farinae cDNA λgt11 library

Screening of the library was performed by hybridization with two
probes comprising the two Der p I cDNA BamHI fragments 1-348 and 349-857
generated by BamHI digestion of a derivative of the Der p I cDNA which has had
two BamHI restriction sites inserted between amino acid residues -1 and 1 and
between residues 116 and 117 by site-directed mutagenesis (Chua, K.Y. et al.,
J. Exp. Med. 167:175-182 (1988)). The probes were radiolabeled with 32P by
nick translation. Phage were plated at 20,000 pfu per 150 mm petri dish and
plaques were lifted onto nitrocellulose (Schleicher and Schuell, Dassel, FRG),
denatured and baked (Maniatis, T. et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press (1982)). Prehybridizations were performed
for 2 hours at 42°C in 50% formamide/5 x SSCE/1 x Denhardt's/poly C
(0.1 mg/ml)/poly U (0.1 mg/ml) with hybridization overnight at 42°C at 10^6
cpm/ml. Post hybridization washes consisted of 15 min washes at room
temperature with 2 x sodium chloride citrate (SSC)/0.1% sodium dodecylsulphate
(SDS), 0.5 x SSC/0.1% SDS, 0.1 x SSC/0.1% SDS successively and a final wash
at 50°C for 30 min in 0.1 x SSC/1% SDS.
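
The wash series above increases in stringency as the salt concentration falls. The sketch below tabulates the approximate sodium concentration contributed by SSC at each step, assuming the conventional composition of 1 x SSC (150 mM NaCl, 15 mM trisodium citrate); this composition is not stated in the text, and the sodium contributed by SDS is ignored.

```python
# Conventional 1 x SSC composition (an assumption, not given in the text).
NACL_MM_1X = 150.0
TRISODIUM_CITRATE_MM_1X = 15.0


def sodium_mm(ssc_fold):
    """Approximate Na+ concentration (mM) from SSC at the given fold strength."""
    return ssc_fold * (NACL_MM_1X + 3 * TRISODIUM_CITRATE_MM_1X)


# (SSC fold, % SDS, temperature, minutes) for each wash, as listed in the text.
washes = [
    (2.0, 0.1, "room temperature", 15),
    (0.5, 0.1, "room temperature", 15),
    (0.1, 0.1, "room temperature", 15),
    (0.1, 1.0, "50 C", 30),
]

for ssc, sds, temperature, minutes in washes:
    print(f"{ssc} x SSC / {sds}% SDS, {temperature}, {minutes} min: "
          f"~{sodium_mm(ssc):.1f} mM Na+")
```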

Isolation of DNA from λgt11 f1 cDNA clones

Phage DNA from λgt11 f1 clones was prepared by a rapid isolation
procedure. Clarified phage plate lysate (1 ml) was mixed with 270 µl of 25% wt/vol
polyethylene glycol (PEG 6000) in 2.5 M NaCl and incubated at room temperature
for 15 min. The mixture was then spun for 5 min in a microfuge (Eppendorf,
FRG), and the supernatant was removed. The pellet was dissolved in 100 µl of
10 mM Tris/HCl pH 8.0 containing 1 mM EDTA and 100 mM NaCl (TE). This
DNA preparation was extracted with phenol/TE, the phenol phase was washed
with 100 µl TE, the pooled aqueous phases were then extracted another 2 times
with phenol/TE, 2 times with Leder phenol (phenol/chloroform/isoamylalcohol;
25:24:1), once with chloroform and the DNA was precipitated by ethanol.
DNA sequencing

To obtain clones for DNA sequence analysis, the λgt11 f1 phage
DNA was digested with EcoRI restriction enzyme (Pharmacia, Uppsala, Sweden)
and the DNA insert was ligated to EcoRI-digested M13-derived sequencing
vectors mp18 and mp19 (Maniatis, T. et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (1982)).
Transformation was carried out using E. coli TG-1 and sequencing was performed
by the dideoxynucleotide chain termination method (Sanger, F. et al.,
Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977)) using the Sequenase version
2.0 DNA sequencing kit (U.S.B., Cleveland, Ohio).

Polymerase chain reaction (PCR)

PCR was performed by the Taq DNA polymerase method (Saiki,
R.K. et al., Science 239:487-491 (1988)) using the TaqPaq kit (Biotech
International, Bentley, WA) and the conditions recommended by the supplier with
10 ng of target DNA and 10 pmol of λgt11 primers (New England BioLabs,
Beverly, MA).



RESULTS 

Isolation of Der f I cDNA clones

Two clones expressing the major mite allergen Der f I were isolated
from the D. farinae cDNA λgt11 library by their ability to hybridize with both of
the Der p I cDNA probes (nucleotides 1-348 and 349-857). This approach was
adopted because amino acid sequencing had shown high homology (80%)
between these two allergens (Thomas, W.R., et al., Advances in the Biosciences
14:139-147 (1989)). Digestion of the λgt11 f1 clone DNA with EcoRI restriction
enzyme to release the cDNA insert produced three Der f I cDNA EcoRI
fragments: one approximately 800 bases long and a doublet approximately 150
bases long. The Der f I cDNA insert was also amplified from the phage DNA by
the polymerase chain reaction (PCR), resulting in a PCR product of approximately
1.1 kb. Each Der f I cDNA fragment was cloned separately into the M13-derived
sequencing vectors mp18 and mp19 and sequenced.
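
Release of three EcoRI fragments from a linker-flanked insert implies two internal EcoRI sites. The sketch below shows how fragment sizes follow from site positions in a linear sequence; the toy sequence is hypothetical and is not the Der f I cDNA.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    EcoRI recognizes GAATTC and cuts after the first G, hence cut_offset=1.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical insert with two internal EcoRI sites, giving three fragments.
toy_insert = "A" * 10 + "GAATTC" + "C" * 8 + "GAATTC" + "T" * 5
fragments = digest(toy_insert)
print(fragments, sum(fragments))

# Approximate fragment sizes reported in the text: one ~800 and a doublet of ~150.
reported = [800, 150, 150]
print(sum(reported))
```

The reported fragments sum to about 1100 bases, in line with the PCR product of approximately 1.1 kb.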

DNA sequence analysis

The nucleotide sequence of Der f I cDNA was determined using the
sequencing strategy shown in Figure 9. The complete sequence was shown to be
1084 bases long and included a 335-base long 5' proximal end sequence, a coding
region for the entire native Der f I protein of 223 amino acids with a derived
molecular weight of 25,191 and an 80-base long 3' noncoding region (Figures 10A
and 10B). The assignment of the threonine residue at position 1 as the NH2-terminal
amino acid of Der f I was based on data obtained by NH2-terminal amino acid
sequencing of the native protein and the predicted amino acid sequence of
recombinant Der p I (Chua, K.Y. et al., J. Exp. Med. 167:175-182 (1988)). The
predicted amino acid sequence of the Der f I cDNA in the NH2-terminal region
matched completely with that determined at the protein level (Figures 10A and 10B).

The complete mature protein is coded by a single open reading frame
terminating at the TGA stop codon. The ATG codon at nucleotide position 42-44
is presumed to be the translation initiation site, since the subsequent sequence
codes for a typical signal peptide sequence.

Amino Acid Sequence Analysis

The amino acid sequence of Der f I predicted by nucleotide analysis is
shown in Figures 10A and 10B. As shown in the composite alignment of the amino
acid sequence of mature Der p I and Der f I (Figure 11), high homology was observed
between the two proteins. Sequence homology analysis revealed that the Der f I
protein showed 81% homology with the Der p I protein, as predicted by previous
conventional amino acid sequencing. In particular, the residues making up the active
site of Der p I, based on those determined for papain, actinidin, cathepsin H, and
cathepsin B, are also conserved in the Der f I protein. The residues are glutamine
(residue 29), glycine, serine and cysteine (residues 33-35), histidine (residue 171) and
asparagine, serine and tryptophan (residues 191-193), where the numbering refers to
Der f I. The predicted mature Der f I amino acid sequence contains a potential
N-glycosylation site (Asn-Thr-Ser) at position 53-55, which is also present as
Asn-Gln-Ser at the equivalent position in Der p I.
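
Potential N-glycosylation sites of the kind noted above follow the Asn-X-Ser/Thr pattern, where X is any residue except proline. A minimal sketch of scanning a protein sequence for this motif is given below; the short peptides are hypothetical and are not taken from the Der f I or Der p I sequences.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr (lookahead allows overlapping hits).
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each potential N-glycosylation site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Hypothetical peptides carrying the two tripeptides named in the text.
print(find_sequons("AANTSGG"))  # Asn-Thr-Ser
print(find_sequons("AANQSGG"))  # Asn-Gln-Ser
print(find_sequons("AANPSGG"))  # proline at X: not a site
```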

Analysis of the predicted amino acid sequence of the entire Der f I cDNA
insert has shown that, as for other cysteine proteases (Figures 12A and 12B), the
Der f I protein has pre- and proform intermediates. As previously mentioned, the
methionine residue at position -98 is presumed to be the initiation methionine. This
assumption is based on two observations. Firstly, the 5' proximal end sequence from
residues -98 to -81 is composed predominantly of hydrophobic amino acid residues (72%),
which is the characteristic feature of signal peptides (Von Heijne, G., EMBO J.,
1:2315-2323 (1984)). Secondly, the lengths of the presumptive pre- (18 amino acid
residues) and pro-peptides (80 residues) are similar to those for other cysteine
proteases (Figures 12A and 12B). Most cysteine proteases examined have about 120
preproenzyme residues (of which an average of 19 residues form the signal peptide),
with cathepsin B the smallest with 80 (Ishidoh, K. et al., FEBS Letters, 226:32-37
(1987)). Der f I falls within this range with a total of 98 preproenzyme residues.
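
The 72% figure and the 18 + 80 = 98 residue count can be reproduced from the listed sequence. This is a rough sketch under stated assumptions: the signal peptide string was transcribed by hand from residues -98 to -81 of SEQ ID NO:6, and the set of residues counted as "hydrophobic" (Ala, Val, Leu, Ile, Met, Phe, Trp) is an assumption, since the text does not say which residues were counted.

```python
# Presumptive Der f I signal peptide, residues -98 to -81 of SEQ ID NO:6,
# transcribed by hand into one-letter code (assumption of this sketch).
SIGNAL_PEPTIDE = "MKFVLAIASLLVLSTVYA"

# Assumed hydrophobic set; the source does not define it.
HYDROPHOBIC = set("AVLIMFW")


def hydrophobic_fraction(peptide, hydrophobic=HYDROPHOBIC):
    """Fraction of residues in `peptide` that belong to the hydrophobic set."""
    return sum(residue in hydrophobic for residue in peptide) / len(peptide)


PRE_LENGTH = len(SIGNAL_PEPTIDE)   # presumptive pre-peptide (signal) length
PRO_LENGTH = 80                    # pro-peptide length stated in the text
PREPRO_LENGTH = PRE_LENGTH + PRO_LENGTH

if __name__ == "__main__":
    print(f"hydrophobic: {100 * hydrophobic_fraction(SIGNAL_PEPTIDE):.0f}%")
    print(f"preproenzyme residues: {PREPRO_LENGTH}")
```

With this residue set, 13 of the 18 positions are hydrophobic, which rounds to the 72% quoted above; a broader or narrower definition of hydrophobicity would change the figure.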

By following the method for predicting signal-sequence cleavage
sites outlined in Von Heijne, it is proposed that cleavage from the pre-Der f I
sequence for proenzyme formation occurs at the signal peptidase cleavage site
lying between Ala (-81) and Arg (-80) (Von Heijne, G., Eur. J. Biochem., 133:17-21
(1983) and J. Mol. Biol., 184:99-105 (1985)). Thus, the sequence from
residues -98 to -81 codes for the leader peptide, while the proenzyme moiety of
Der f I begins at residue Arg (-80) and ends at residue Glu (-1).

EXAMPLE 7

Isolation and Characterization of cDNA Coding for Der f II
MATERIALS AND METHODS
Amino acAri f^i^ ce analysis 

Preparation of λgt11 D. farinae cDNA ligations

D. farinae was purchased from Commonwealth Serum Laboratories,
Parkville, Australia, and used to prepare mRNA (polyadenylated RNA) as
described (Stewart, G.A. and W.R. Thomas, Int. Arch. Allergy Appl. Immunol.,
83:384-389 (1987)). The mRNA was suspended at approximately 0.5 mg/ml and
5 μg was used to prepare cDNA by the RNase H method (Gubler, U. and Hoffman,
B.J., Gene, 25:263-269 (1983)) using a kit (Amersham International, Bucks).
EcoRI linkers (Amersham, GGAATTCC) were attached according to the method
described by Huynh et al. (Constructing and screening cDNA libraries in λgt10 and
λgt11, In: Glover, DNA Cloning vol. A practical approach, pp. 47-78, IRL Press,
Oxford (1985)). The DNA was then digested with EcoRI and recovered from an
agarose gel purification by electrophoresis into a DEAE membrane (Schleicher
and Schuell, Dassel, FRG, NA-45) according to protocol 6.24 of Sambrook et al.
(Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d Ed., Cold Spring
Harbor Laboratory Press (1989)), except that 0.5M arginine base was used for elution.
The cDNA was then ligated in λgt10 and λgt11 at an arms to insert ratio of 2:1.
Some was packaged for plaque libraries and an aliquot was retained for isolating
sequences by polymerase chain reaction as described below.

Isolation of Der f II cDNA by Polymerase Chain Reaction

To isolate Der f II cDNA, an oligonucleotide primer based on the
N-terminal sequence of Der p II was made because their amino acid residues are
identical in these regions (Heymann, P.W. et al., J. Allergy Clin. Immunol.,
83:1055-1087 (1989)). The primer 5'-GGATCCGATCAACTCGATGC-3' was
used. The first GGATCC forms a BamHI site and the following sequence
GAT... encodes the first four residues of Der p II. For the other primer the λgt11
5'-TTGACACCAGACCAACTGGTAATG-3' reverse primer flanking the EcoRI
cloning site was used (New England Biolabs, Beverly, MA). The Der p II primer
was designed to have approximately 50-60% G-C and to end on the first or
second, rather than the third, base of a codon (Gould, S.J. et al.,
Proc. Natl. Acad. Sci., 86:1934-1938 (1989); Summer, R. and D. Tautz,
Nucleic Acids Res., 12:6749 (1989)).
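
The two design criteria stated above (roughly 50-60% G-C, and a 3' end on the first or second base of a codon) can be checked mechanically. The sketch below is illustrative only: the primer is copied as printed in the text, the six-base BamHI linker is assumed to sit outside the reading frame, and the function names are hypothetical.

```python
# N-terminal primer as printed in the text; the leading GGATCC is the BamHI site.
FORWARD_PRIMER = "GGATCCGATCAACTCGATGC"
LINKER = "GGATCC"


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide string."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def last_base_codon_position(primer, linker):
    """Codon position (1, 2 or 3) of the primer's 3' base, counting from the
    first base after the linker (assumed to be the first base of codon 1)."""
    coding = primer[len(linker):]
    return (len(coding) - 1) % 3 + 1


if __name__ == "__main__":
    print(f"G+C: {gc_percent(FORWARD_PRIMER):.1f}%")
    print("3' end falls on codon base",
          last_base_codon_position(FORWARD_PRIMER, LINKER))
```

For the primer as printed, 11 of 20 bases are G or C, and the 14 bases after the linker end on the second base of the fifth codon, so both criteria are met.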

The PCR reactions were carried out in a final reaction volume of 25
μl containing 67 mM Tris-HCl (pH 8.8 at 25°C), 16.6 mM (NH4)2SO4, 40 μM
dNTPs, 5 mM 2-mercaptoethanol, 6 μM EDTA, 0.2 mg/ml gelatin, 2 mM MgCl2,
10 pmoles of each primer and 2 units of Taq polymerase. Approximately 0.001 μg
of target DNA was added and the contents of the tube were mixed and overlaid
with paraffin oil. The tubes were initially denatured at 95°C for 6 minutes, then
annealed at 55°C for 1 minute and extended at 72°C for 2 minutes. Thereafter, for
38 cycles, denaturing was carried out for 30 seconds and annealing and extension
as before. In the final (40th) cycle, the extension reaction was increased to 10
minutes to ensure that all amplified products were full length. The annealing
temperature was deliberately set slightly lower than the Tm of the oligonucleotide
primers (determined by the formula Tm = 69.3 + 0.41 (G+C%) - 650/oligo length) to
allow for mismatches in the N-terminal primer.
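
As a worked illustration of the Tm formula quoted above, the following sketch applies it to the two primers as they are printed in this example. The function and variable names are hypothetical, and the primer strings are taken verbatim from the text without independent verification.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide string."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temp(seq):
    """Tm in degrees C using the formula quoted in the text:
    Tm = 69.3 + 0.41 * (G+C%) - 650 / oligo length."""
    return 69.3 + 0.41 * gc_percent(seq) - 650.0 / len(seq)


# Primer sequences as printed in the text.
PRIMERS = {
    "N-terminal primer": "GGATCCGATCAACTCGATGC",
    "lambda gt11 reverse primer": "TTGACACCAGACCAACTGGTAATG",
}
ANNEALING_C = 55.0  # annealing temperature used in the PCR programme

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        tm = melting_temp(seq)
        print(f"{name}: Tm = {tm:.2f} C, margin over annealing = {tm - ANNEALING_C:.2f} C")
```

Under this formula the two primers come out at roughly 59.4°C and 61.0°C, a few degrees above the 55°C annealing step, which agrees with the statement that annealing was set slightly below Tm.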

5 μl of the reaction was then checked for amplified bands on a 1%
agarose gel. The remainder of the reaction mixture was extracted with chloroform
to remove all of the paraffin oil and ethanol precipitated prior to purification of
the amplified product on a low melting point agarose gel (Bio-Rad, Richmond,
CA).



Subcloning of PCR Product

The ends of the purified PCR product were filled in a reaction
containing 10 mM Tris HCl, 10 mM MgCl2, 50 mM NaCl, 0.025 mM dNTP and
1 μl of Klenow enzyme in a final volume of 100 μl. The reaction was carried out
at 37°C for 15 minutes and heat inactivated at 70°C for 10 minutes. The mixture
was Leder phenol extracted before ethanol precipitation. The resulting blunt
ended DNA was ligated into M13mp18 digested with Sma I in a reaction
containing 0.5 mM ATP, 1 X ligase buffer and 1 unit of T4 ligase at 15°C for 24 hrs
and transformed into E. coli TG1 made competent by the CaCl2 method. The
transformed cells were plated out as a lawn on L + G plates and grown overnight
at 37°C.
Preparation of Single-stranded DNA Template for Sequencing

Isolated white plaques were picked using an orange stick into 2.5 ml
of an overnight culture of TG1 cells diluted 1 in 100 in 2 X TY broth, and grown
at 37°C for 6 hours. The cultures were pelleted and the supernatant removed to a
fresh tube. To a 1 ml aliquot of this supernatant 270 μl of 20% polyethylene
glycol, 2.5M NaCl was added and the tube was vortexed before allowing it to
stand at room temperature (RT) for 15 minutes. This was then spun down again
and all traces of the supernatant were removed from the tube. The pellet was then
resuspended in 100 μl of 1 X TE buffer. At least 2 phenol:TE extractions were
done, followed by 1 Leder phenol extraction and a CHCl3 extraction. The DNA
was precipitated in ethanol and resuspended in a final volume of 20 μl of TE
buffer.



DNA Analyses 

DNA sequencing was performed with the dideoxynucleotide chain
termination method (Sanger, F. et al., Proc. Natl. Acad. Sci., 74:5463-5467 (1977))
using DNA produced from M13 derived vectors mp18 and mp19 in E. coli TG1 and T7
DNA polymerase (Sequenase version 2.0, USB Corp., Cleveland, Ohio).
Restriction endonucleases were from Toyobo (Osaka, Japan). All general
procedures were by standard techniques (Sambrook, J. et al., Molecular Cloning:
A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press (1989)). The
sequence analysis was performed using the MacVector software (IBI, New Haven, CT).

RESULTS 

D. farinae cDNA ligated in λgt11 was used to amplify a sequence
using an oligonucleotide primer with homology to nucleotides coding for the 4
N-terminal residues of Der p II and a reverse primer for the λgt11 sequence flanking
the cloning site. Two major bands of about 500 bp and 300 bp were obtained
when the product was gel electrophoresed. These were ligated into M13mp18
and a number of clones containing the 500 bp fragment were analyzed by DNA
sequencing. Three clones produced sequence data from the N-terminal primer
end and one from the other orientation. Where the sequence data from the two
directions overlapped, a complete match was found. One of the clones read from
the N-terminal primer contained a one-base deletion which shifted the reading
frame. It was deduced to be a copying error, as the translated sequence from the
other two clones matched the protein sequence for the first 20 amino acid residues
of the allergen.
The sequence of the clones showing consensus and producing a correct
reading frame is shown in Figure 14, along with the inferred amino acid sequence. It
coded for a 129 residue protein with no N-glycosylation site and a calculated
molecular weight of 14.021 kD. No homology was found when compared to other
proteins on the GenBank data base (61.0 release). It did, however, show 88% amino
acid residue homology with Der p II, shown in the alignment in Figures 16A, 16B, and
16C. Seven out of the 16 changes were conservative. The conserved residues also
include all the cysteines present at positions 8, 21, 27, 73 and 119. There was also
considerable nucleotide homology, although the restriction enzyme map generated
from the sequence data for commonly used enzymes is different from Der p II (Figure
15). The hydrophobicity plots of the translated sequences of Der f II and Der p II
shown in Figures 17A and 17B are almost identical.
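
The 88% figure and the count of 16 changes can be reproduced from the Sequence Listing. The sketch below is an illustration, not the original homology analysis: both sequences were transcribed by hand into one-letter code (mature Der p II from residues 1-129 of SEQ ID NO:4, Der f II from SEQ ID NO:8), and a simple ungapped position-by-position comparison is assumed to be adequate because both are 129 residues long.

```python
# Mature Der p II, residues 1-129 of SEQ ID NO:4 (hand-transcribed; assumption).
DER_P_II = "".join([
    "DQVDVKDCANHEIKKV",  # 1-16
    "LVPGCHGSEPCIIHRG",  # 17-32
    "KPFQLEAVFEANQNTK",  # 33-48
    "TAKIEIKASIDGLEVD",  # 49-64
    "VPGIDPNACHYMKCPL",  # 65-80
    "VKGQQYDIKYTWNVPK",  # 81-96
    "IAPKSENVVVTVKVMG",  # 97-112
    "DDGVLACAIATHAKIR",  # 113-128
    "D",                 # 129
])

# Der f II, SEQ ID NO:8 (hand-transcribed; assumption).
DER_F_II = "".join([
    "DQVDVKDCANNEIKKV",  # 1-16
    "MVDGCHGSDPCIIHRG",  # 17-32
    "KPFTLEALFDANQNTK",  # 33-48
    "TAKTEIKASLDGLEID",  # 49-64
    "VPGIDTNACHFMKCPL",  # 65-80
    "VKGQQYDAKYTWNVPK",  # 81-96
    "IAPKSENVVVTVKLVG",  # 97-112
    "DNGVLACAIATHAKIR",  # 113-128
    "D",                 # 129
])


def differences(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal lengths")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


DIFFS = differences(DER_P_II, DER_F_II)
IDENTITY_PERCENT = 100.0 * (len(DER_P_II) - len(DIFFS)) / len(DER_P_II)

if __name__ == "__main__":
    print(len(DIFFS), "differences at", DIFFS)
    print(f"identity: {IDENTITY_PERCENT:.1f}%")
```

On these transcriptions the comparison gives 16 differing positions out of 129, about 87.6% identity, which rounds to the 88% stated above. Classifying changes as conservative is not attempted here, since the text does not give the criterion used.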

EXAMPLE 8 

Determination of Nucleotide Sequence Polymorphisms in
the Der p I, Der p II and Der f II Allergens

It was expected that there were sequence polymorphisms in the nucleic
acid sequence coding for Der p I, Der p II, Der f I and Der f II, due to natural allelic
variation among individual mites. Several nucleotide and resulting amino acid
sequence polymorphisms were discovered during the sequencing of different Der p I,
Der p II and Der f II clones. The amino acid sequence polymorphisms are shown in
Figures 18, 19 and 20.
The original Der p I λgt11 cDNA library was reprobed with cDNA
obtained from the λgt11 pI(13T) clone to identify new clones. Similarly, the λgt11
cDNA library of Der p II was reprobed with cDNA obtained from the λgt11 pII(C1)
clone to identify additional Der p II clones. These clones were isolated, sequenced
and found to contain nucleotide and resulting amino acid sequence polymorphisms
(see Fig. 18 and 19).

Four Der p I clones, (b), (c), (d) and (e), were sequenced, as shown in Fig.
18. Clone Der p I(d) was found to contain the following polymorphisms relative to
the clone Der p I(a) sequence: (1) the codon for amino acid residue 136 was ACC
rather than AGC, which results in a predicted amino acid substitution of Thr for Ser;
(2) the codon for amino acid residue 149 had a silent mutation, GCT rather than GCA;
and (3) the codon for amino acid residue 215 was CAA rather than GAA, which
results in a predicted amino acid substitution of Gln for Glu.
The Der p II clones, Der p II(1) and Der p II(2), were sequenced as
shown in Figure 19. Clone Der p II(2) was found to have the codon TCA, rather
than ACA, at amino acid residue 47, which results in a predicted amino acid
substitution of Ser for Thr. This clone also was found to have the codon AAT at
amino acid residue 113 rather than GAT, which results in a predicted amino acid
substitution of Asn for Asp. The codon for amino acid 127 of this clone was
found to be CTC rather than ATC. This change in codon 127 results in a
predicted amino acid substitution of Leu for Ile.

Additional Der f II cDNA clones containing nucleic acid and
resulting amino acid sequence polymorphisms were obtained from PCR reactions
using cDNA prepared with RNA isolated from D. farinae mites (Commonwealth
Serum Laboratories, Parkville, Australia). cDNA was prepared and ligated in
λgt10 as previously described (Trudinger et al. (1991) Clin. Exp. Allergy 21:33-
37). The clones described below were isolated following PCR of the λgt10 library
using a 5' primer, which had the sequence 5'-GGATCCGATCAAGTCGATGT-3'.
The nucleotides 5'-GGATCC-3' of the 5' primer correspond to a BamHI
endonuclease site added for cloning purposes. The remaining nucleotides of the 5'
primer, 5'-GATCAAGTCGATGT-3', correspond to the first 4 amino acids of
Der p II (Chua et al. (1990) Int. Arch. Allergy Clin. Immunol. 91:118-123) as
described in Trudinger et al. ((1991) Clin. Exp. Allergy 21:33-37). The 3' primer,
which has the sequence 5'-TTGACACCAGACCAACTGGTAATG-3',
corresponds to a sequence of the λgt10 cloning vector (Trudinger et al., supra).
PCR was performed as described (Trudinger et al., supra) and four
Der f II clones, MT3, MT5, MT16 and MT18, were sequenced, as shown in
Figure 20. Three clones were sequenced that had potential polymorphisms
relative to the published Der f II sequence (Trudinger et al., supra). The codon for
amino acid 52 of clone MT18 was ATT rather than the published ACT (Trudinger
et al., supra). This change in codon 52 of clone MT18 would result in a predicted
amino acid change from Thr to Ile. Clone MT5 contained three changes from the
published sequence (Trudinger et al., supra): (1) the codon for amino acid 11 was
AGC rather than the published AAC (Trudinger et al., supra), which results in a
predicted amino acid substitution of Ser for Asn; (2) the codon for amino acid 52
was ATT, rather than the published ACT (Trudinger et al., supra), which results in
a predicted amino acid substitution of Ile for Thr; and (3) the codon for amino
acid 88 was ATC rather than the published GCC (Trudinger et al., supra), which
results in a predicted amino acid substitution of Ile for Ala. Clone MT16 had a
silent mutation in the codon for amino acid 68 (ATC versus the published ATT
(Trudinger et al., supra)) that did not change the predicted amino acid at this
residue. The following substitutions were also observed by Yuuki et al.
(Jpn. J. Allergol. 39:557-561, 1990): Ile at residue 52, Ile at residue 54 and Ile at
residue 88.
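
The codon changes reported in this example for the Der p I, Der p II and Der f II clones can be checked against the standard genetic code. The following is a minimal sketch: the codon pairs are copied from the text, the table is built from the conventional TCAG ordering of the standard code, and the names `CHANGES` and `classify` are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG x TCAG x TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# (clone, residue, reference codon, variant codon) as reported in the text.
CHANGES = [
    ("Der p I(d)", 136, "AGC", "ACC"),
    ("Der p I(d)", 149, "GCA", "GCT"),
    ("Der p I(d)", 215, "GAA", "CAA"),
    ("Der p II(2)", 47, "ACA", "TCA"),
    ("Der p II(2)", 113, "GAT", "AAT"),
    ("Der p II(2)", 127, "ATC", "CTC"),
    ("MT18", 52, "ACT", "ATT"),
    ("MT5", 11, "AAC", "AGC"),
    ("MT5", 52, "ACT", "ATT"),
    ("MT5", 88, "GCC", "ATC"),
    ("MT16", 68, "ATT", "ATC"),
]


def classify(reference, variant):
    """Return 'silent' or 'X->Y' (one-letter codes) for a codon change."""
    ref_aa, var_aa = CODON_TABLE[reference], CODON_TABLE[variant]
    return "silent" if ref_aa == var_aa else f"{ref_aa}->{var_aa}"


RESULTS = [classify(ref, var) for _, _, ref, var in CHANGES]

if __name__ == "__main__":
    for (clone, residue, ref, var), outcome in zip(CHANGES, RESULTS):
        print(f"{clone:12s} residue {residue:3d}: {ref} -> {var}  {outcome}")
```

Each predicted substitution in the text agrees with the table: for example AGC to ACC gives Ser to Thr, GCC to ATC gives Ala to Ile, and the GCA/GCT and ATT/ATC pairs are silent.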

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no
more than routine experimentation, many equivalents to the specific embodiments
of the invention described herein. Such equivalents are intended to be
encompassed by the following claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: IMMULOGIC PHARMACEUTICAL CORPORATION
    (B) STREET: 610 LINCOLN STREET
    (C) CITY: WALTHAM
    (D) STATE: MASSACHUSETTS
    (E) COUNTRY: USA
    (F) POSTAL CODE (ZIP): 02154
    (G) TELEPHONE: (617) 466-6000
    (H) TELEFAX: (617) 466-6010

(ii) TITLE OF INVENTION: CLONING AND SEQUENCING OF ALLERGENS FROM
    DERMATOPHAGOIDES (HOUSE DUST MITES)

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: LAHIVE & COCKFIELD
    (B) STREET: 60 STATE STREET, SUITE 510
    (C) CITY: BOSTON
    (D) STATE: MA
    (E) COUNTRY: USA
    (F) ZIP: 02109

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: ASCII TEXT

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 07/945,288
    (B) FILING DATE: 10 SEPTEMBER 1992

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (617) 227-7400
    (B) TELEFAX: (617) 227-5941

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 834 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAA AAC CGA TTT TTG ATG AGT GCA GAA GCT TTT GAA CAC CTC AAA ACT  48
Lys Asn Arg Phe Leu Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr
-23         -20                 -15                 -10

CAA TTC GAT TTG AAT GCT GAA ACT AAC GCC TGC AGT ATC AAT GGA AAT  96
Gln Phe Asp Leu Asn Ala Glu Thr Asn Ala Cys Ser Ile Asn Gly Asn
        -5              -1  1               5

GCT CCA GCT GAA ATC GAT TTG CGA CAA ATG CGA ACT GTC ACT CCC ATT  144
Ala Pro Ala Glu Ile Asp Leu Arg Gln Met Arg Thr Val Thr Pro Ile
10                  15                  20                  25

CGT ATG CAA GGA GGC TGT GGT TCA TGT TGG GCT TTC TCT GGT GTT GCC  192
Arg Met Gln Gly Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala
                30                  35                  40

GCA ACT GAA TCA GCT TAT TTG GCT CAC CGT AAT CAA TCA TTG GAT CTT  240
Ala Thr Glu Ser Ala Tyr Leu Ala His Arg Asn Gln Ser Leu Asp Leu
            45                  50                  55

GCT GAA CAA GAA TTA GTC GAT TGT GCT TCC CAA CAC GGT TGT CAT GGT  288
Ala Glu Gln Glu Leu Val Asp Cys Ala Ser Gln His Gly Cys His Gly
        60                  65                  70

GAT ACC ATT CCA CGT GGT ATT GAA TAC ATC CAA CAT AAT GGT GTC GTC  336
Asp Thr Ile Pro Arg Gly Ile Glu Tyr Ile Gln His Asn Gly Val Val
    75                  80                  85

CAA GAA AGC TAC TAT CGA TAC GTT GCA CGA GAA CAA TCA TGC CGA CGA  384
Gln Glu Ser Tyr Tyr Arg Tyr Val Ala Arg Glu Gln Ser Cys Arg Arg
90                  95                  100                 105

CCA AAT GCA CAA CGT TTC GGT ATC TCA AAC TAT TGC CAA ATT TAC CCA  432
Pro Asn Ala Gln Arg Phe Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro
                110                 115                 120

CCA AAT GCA AAC AAA ATT CGT GAA GCT TTG GCT CAA ACC CAC AGC GCT  480
Pro Asn Ala Asn Lys Ile Arg Glu Ala Leu Ala Gln Thr His Ser Ala
            125                 130                 135

ATT GCC GTC ATT ATT GGC ATC AAA GAT TTA GAC GCA TTC CGT CAT TAT  528
Ile Ala Val Ile Ile Gly Ile Lys Asp Leu Asp Ala Phe Arg His Tyr
        140                 145                 150

GAT GGC CGA ACA ATC ATT CAA CGC GAT AAT GGT TAG CAA CCA AAC TAT  576
Asp Gly Arg Thr Ile Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr
    155                 160                 165

CAC GCT GTC AAC ATT GTT GGT TAC AGT AAC GCA CAA GGT GTC GAT TAT  624
His Ala Val Asn Ile Val Gly Tyr Ser Asn Ala Gln Gly Val Asp Tyr
170                 175                 180                 185

TGG ATC GTA CGA AAC AGT TGG GAT ACC AAT TGG GGT GAT AAT GGT TAC  672
Trp Ile Val Arg Asn Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr
                190                 195                 200

GGT TAT TTT GCT GCC AAC ATC GAT TTG ATG ATG ATT GAA GAA TAT CCA  720
Gly Tyr Phe Ala Ala Asn Ile Asp Leu Met Met Ile Glu Glu Tyr Pro
            205                 210                 215

TAT GTT GTC ATT CTC TAAACAAAAA GACAATTTCT TATATGATTG TCACTAATTT  775
Tyr Val Val Ile Leu
        220

ATTTAAAATC AAAATTTTTT AGAAAATGAA TAAATTCATT CACAAAAATT AAAAAAAAA  834



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 245 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Lys Asn Arg Phe Leu Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr
-23         -20                 -15                 -10

Gln Phe Asp Leu Asn Ala Glu Thr Asn Ala Cys Ser Ile Asn Gly Asn
        -5              -1  1               5

Ala Pro Ala Glu Ile Asp Leu Arg Gln Met Arg Thr Val Thr Pro Ile
10                  15                  20                  25

Arg Met Gln Gly Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala
                30                  35                  40

Ala Thr Glu Ser Ala Tyr Leu Ala His Arg Asn Gln Ser Leu Asp Leu
            45                  50                  55

Ala Glu Gln Glu Leu Val Asp Cys Ala Ser Gln His Gly Cys His Gly
        60                  65                  70

Asp Thr Ile Pro Arg Gly Ile Glu Tyr Ile Gln His Asn Gly Val Val
    75                  80                  85

Gln Glu Ser Tyr Tyr Arg Tyr Val Ala Arg Glu Gln Ser Cys Arg Arg
90                  95                  100                 105

Pro Asn Ala Gln Arg Phe Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro
                110                 115                 120

Pro Asn Ala Asn Lys Ile Arg Glu Ala Leu Ala Gln Thr His Ser Ala
            125                 130                 135

Ile Ala Val Ile Ile Gly Ile Lys Asp Leu Asp Ala Phe Arg His Tyr
        140                 145                 150

Asp Gly Arg Thr Ile Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr
    155                 160                 165

His Ala Val Asn Ile Val Gly Tyr Ser Asn Ala Gln Gly Val Asp Tyr
170                 175                 180                 185

Trp Ile Val Arg Asn Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr
                190                 195                 200

Gly Tyr Phe Ala Ala Asn Ile Asp Leu Met Met Ile Glu Glu Tyr Pro
            205                 210                 215

Tyr Val Val Ile Leu
        220



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 588 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 69..509

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CACAAATTCT TCTTTCTTCC TTACTACTGA TCATTAATCT GAAAACAAAA CCAAACAAAC  60

CATTCAAA ATG ATG TAC AAA ATT TTG TGT CTT TCA TTG TTG GTC GCA GCC  110
         Met Met Tyr Lys Ile Leu Cys Leu Ser Leu Leu Val Ala Ala
         -17     -15                 -10                 -5

GTT GCT CGT GAT CAA GTC GAT GTC AAA GAT TGT GCC AAT CAT GAA ATC  158
Val Ala Arg Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile
        -1  1               5                   10

AAA AAA GTT TTG GTA CCA GGA TGC CAT GGT TCA GAA CCA TGT ATC ATT  206
Lys Lys Val Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys Ile Ile
    15                  20                  25

CAT CGT GGT AAA CCA TTC CAA TTG GAA GCC GTT TTC GAA GCC AAC CAA  254
His Arg Gly Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln
30                  35                  40                  45

AAC ACA AAA ACG GCT AAA ATT GAA ATC AAA GCC TCA ATC GAT GGT TTA  302
Asn Thr Lys Thr Ala Lys Ile Glu Ile Lys Ala Ser Ile Asp Gly Leu
                50                  55                  60

GAA GTT GAT GTT CCC GGT ATC GAT CCA AAT GCA TGC CAT TAC ATG AAA  350
Glu Val Asp Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met Lys
            65                  70                  75

TGC CCA TTG GTT AAA GGA CAA CAA TAT GAT ATT AAA TAT ACA TGG AAT  398
Cys Pro Leu Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn
        80                  85                  90

GTT CCG AAA ATT GCA CCA AAA TCT GAA AAT GTT GTC GTC ACT GTT AAA  446
Val Pro Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys
    95                  100                 105

GTT ATG GGT GAT GAT GGT GTT TTG GCC TGT GCT ATT GCT ACT CAT GCT  494
Val Met Gly Asp Asp Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala
110                 115                 120                 125

AAA ATC CGC GAT TAAATAAACA AAATTTATTG ATTTTGTAAT CACAAATGAT  546
Lys Ile Arg Asp

TGATTTTCTT TCCAAAAAAA AAATAAATAA AATTTTGGGA AT  588



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 146 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Met Tyr Lys Ile Leu Cys Leu Ser Leu Leu Val Ala Ala Val Ala
-17     -15                 -10                 -5

Arg Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys
-1  1               5                   10                  15

Val Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg
                20                  25                  30

Gly Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln Asn Thr
            35                  40                  45

Lys Thr Ala Lys Ile Glu Ile Lys Ala Ser Ile Asp Gly Leu Glu Val
        50                  55                  60

Asp Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met Lys Cys Pro
    65                  70                  75

Leu Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro
80                  85                  90                  95

Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met
                100                 105                 110

Gly Asp Asp Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile
            115                 120                 125

Arg Asp



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1072 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 36..1001

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CGTTTTCTTC CATCAAAATT AAAAATTCAT CAAAA ATG AAA TTC GTT TTG GCC  53
                                       Met Lys Phe Val Leu Ala
                                       -98         -95

ATT GCC TCT TTG TTG GTA TTG AGC ACT GTT TAT GCT CGT CCA GCT TCA  101
Ile Ala Ser Leu Leu Val Leu Ser Thr Val Tyr Ala Arg Pro Ala Ser
        -90                 -85                 -80

ATC AAA ACT TTT GAA GAA TTC AAA AAA GCC TTC AAC AAA AAC TAT GCC  149
Ile Lys Thr Phe Glu Glu Phe Lys Lys Ala Phe Asn Lys Asn Tyr Ala
    -75                 -70                 -65

ACC GTT GAA GAG GAA GAA GTT GCC CGT AAA AAC TTT TTG GAA TCA TTG  197
Thr Val Glu Glu Glu Glu Val Ala Arg Lys Asn Phe Leu Glu Ser Leu
-60                 -55                 -50                 -45

AAA TAT GTT GAA GCT AAC AAA GGT GCC ATC AAC CAT TTG TCC GAT TTG  245
Lys Tyr Val Glu Ala Asn Lys Gly Ala Ile Asn His Leu Ser Asp Leu
                -40                 -35                 -30

TCA TTG GAT GAA TTC AAA AAC CGT TAT TTG ATG AGT GCT GAA GCT TTT  293
Ser Leu Asp Glu Phe Lys Asn Arg Tyr Leu Met Ser Ala Glu Ala Phe
            -25                 -20                 -15

GAA CAA CTC AAA ACT CAA TTC GAT TTG AAT GCC GAA ACA AGC GCT TGC  341
Glu Gln Leu Lys Thr Gln Phe Asp Leu Asn Ala Glu Thr Ser Ala Cys
        -10                 -5              -1  1

CGT ATC AAT TCG GTT AAC GTT CCA TCG GAA TTG GAT TTA CGA TCA CTG  389
Arg Ile Asn Ser Val Asn Val Pro Ser Glu Leu Asp Leu Arg Ser Leu
5                   10                  15                  20

CGA ACT GTC ACT CCA ATC CGT ATG CAA GGA GGC TGT GGT TCA TGT TGG  437
Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly Cys Gly Ser Cys Trp
                25                  30                  35

GCT TTC TCT GGT GTT GCC GCA ACT GAA TCA GCT TAT TTG GCC TAC CGT  485
Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu Ala Tyr Arg
            40                  45                  50

AAC ACG TCT TTG GAT CTT TCT GAA CAG GAA CTC GTC GAT TGC GCA TCT  533
Asn Thr Ser Leu Asp Leu Ser Glu Gln Glu Leu Val Asp Cys Ala Ser
        55                  60                  65

CAA CAC GGA TGT CAC GGC GAT ACA ATA CCA AGA GGC ATC GAA TAC ATC  581
Gln His Gly Cys His Gly Asp Thr Ile Pro Arg Gly Ile Glu Tyr Ile
    70                  75                  80

CAA CAA AAT GGT GTC GTT GAA GAA AGA AGC TAT CCA TAC GTT GCA CGA  629
Gln Gln Asn Gly Val Val Glu Glu Arg Ser Tyr Pro Tyr Val Ala Arg
85                  90                  95                  100

GAA CAA CGA TGC CGA CGA CCA AAT TCG CAA CAT TAC GGT ATC TCA AAC  677
Glu Gln Arg Cys Arg Arg Pro Asn Ser Gln His Tyr Gly Ile Ser Asn
                105                 110                 115

TAC TGC CAA ATT TAT CCA CCA GAT GTG AAA CAA ATC CGT GAA GCT TTG  725
Tyr Cys Gln Ile Tyr Pro Pro Asp Val Lys Gln Ile Arg Glu Ala Leu
            120                 125                 130

ACT CAA ACA CAC ACA GCT ATT GCC GTC ATT ATT GGC ATC AAA GAT TTG  773
Thr Gln Thr His Thr Ala Ile Ala Val Ile Ile Gly Ile Lys Asp Leu
        135                 140                 145

AGA GCT TTC CAA CAT TAT GAT GGA CGA ACA ATC ATT CAA CAT GAC AAT  821
Arg Ala Phe Gln His Tyr Asp Gly Arg Thr Ile Ile Gln His Asp Asn
    150                 155                 160

GGT TAT CAA CCA AAC TAT CAT GCC GTC AAC ATT GTC GGT TAC GGA AGT  869
Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile Val Gly Tyr Gly Ser
165                 170                 175                 180

ACA CAA GGC GAC GAT TAT TGG ATC GTA CGA AAC AGT TGG GAT ACT ACC  917
Thr Gln Gly Asp Asp Tyr Trp Ile Val Arg Asn Ser Trp Asp Thr Thr
                185                 190                 195

TGG GGA GAT AGC GGA TAC GGA TAT TTC CAA GCC GGA AAC AAC CTC ATG  965
Trp Gly Asp Ser Gly Tyr Gly Tyr Phe Gln Ala Gly Asn Asn Leu Met
            200                 205                 210

ATG ATC GAA CAA TAT CCA TAT GTT GTA ATC ATG TGAACATTTG AAATTGAATA  1018
Met Ile Glu Gln Tyr Pro Tyr Val Val Ile Met
        215                 220

TATTTATTTG TTTTCAAAAT AAAAACAACT ACTCTTGCGA GTATTTTTTA CTCG  1072



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 321 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Lys Phe Val Leu Ala Ile Ala Ser Leu Leu Val Leu Ser Thr Val
-98         -95                 -90                 -85

Tyr Ala Arg Pro Ala Ser Ile Lys Thr Phe Glu Glu Phe Lys Lys Ala
        -80                 -75                 -70

Phe Asn Lys Asn Tyr Ala Thr Val Glu Glu Glu Glu Val Ala Arg Lys
    -65                 -60                 -55

Asn Phe Leu Glu Ser Leu Lys Tyr Val Glu Ala Asn Lys Gly Ala Ile
-50                 -45                 -40                 -35

Asn His Leu Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Tyr Leu
                -30                 -25                 -20

Met Ser Ala Glu Ala Phe Glu Gln Leu Lys Thr Gln Phe Asp Leu Asn
            -15                 -10                 -5

Ala Glu Thr Ser Ala Cys Arg Ile Asn Ser Val Asn Val Pro Ser Glu
    -1  1               5                   10

Leu Asp Leu Arg Ser Leu Arg Thr Val Thr Pro Ile Arg Met Gln Gly
15                  20                  25                  30

Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser
                35                  40                  45

Ala Tyr Leu Ala Tyr Arg Asn Thr Ser Leu Asp Leu Ser Glu Gln Glu
            50                  55                  60

Leu Val Asp Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro
        65                  70                  75

Arg Gly Ile Glu Tyr Ile Gln Gln Asn Gly Val Val Glu Glu Arg Ser
    80                  85                  90

Tyr Pro Tyr Val Ala Arg Glu Gln Arg Cys Arg Arg Pro Asn Ser Gln
95                  100                 105                 110

His Tyr Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asp Val Lys
                115                 120                 125

Gln Ile Arg Glu Ala Leu Thr Gln Thr His Thr Ala Ile Ala Val Ile
            130                 135                 140

Ile Gly Ile Lys Asp Leu Arg Ala Phe Gln His Tyr Asp Gly Arg Thr
        145                 150                 155

Ile Ile Gln His Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn
    160                 165                 170

Ile Val Gly Tyr Gly Ser Thr Gln Gly Asp Asp Tyr Trp Ile Val Arg
175                 180                 185                 190

Asn Ser Trp Asp Thr Thr Trp Gly Asp Ser Gly Tyr Gly Tyr Phe Gln
                195                 200                 205

Ala Gly Asn Asn Leu Met Met Ile Glu Gln Tyr Pro Tyr Val Val Ile
            210                 215                 220

Met
(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 491 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..390

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:



PCT/US93/08518 



GAT CAA GTC GAT GTT AAA GAT TGT GCC AAC AAT GAA ATC AAA AAA GTA  48
Asp Gln Val Asp Val Lys Asp Cys Ala Asn Asn Glu Ile Lys Lys Val
1               5                   10                  15

ATG GTC GAT GGT TGC CAT GGT TCT GAT CCA TGC ATA ATC CAT CGT GGT  96
Met Val Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly
            20                  25                  30

AAA CCA TTC ACT TTG GAA GCC TTA TTC GAT GCC AAC CAA AAC ACT AAA  144
Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys
        35                  40                  45

ACC GCT AAA ACT GAA ATC AAA GCC AGC CTC GAT GGT CTT GAA ATT GAT  192
Thr Ala Lys Thr Glu Ile Lys Ala Ser Leu Asp Gly Leu Glu Ile Asp
    50                  55                  60

GTT CCC GGT ATT GAT ACC AAT GCT TGC CAT TTT ATG AAA TGT CCA TTG  240
Val Pro Gly Ile Asp Thr Asn Ala Cys His Phe Met Lys Cys Pro Leu
65                  70                  75                  80

GTT AAA GGT CAA CAA TAT GAT GCC AAA TAT ACA TGG AAT GTG CCC AAA  288
Val Lys Gly Gln Gln Tyr Asp Ala Lys Tyr Thr Trp Asn Val Pro Lys
                85                  90                  95

ATT GCA CCA AAA TCT GAA AAC GTT GTC GTT ACA GTC AAA CTT GTT GGT  336
Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Val Gly
            100                 105                 110

GAT AAT GGT GTT TTG GCT TGC GCT ATT GCT ACC CAC GCT AAA ATC CGT  384
Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg
        115                 120                 125

GAT TAAAAAAAAA AAATAAATAT GAAAATTTTC ACCAACATCG AACAAAATTC  437
Asp

AATAACCAAA ATTTGAATCA AAAACGGAAT TCCAAGCTGA GCGCCGGTCG CTAC  491



(2) INFORMATION FOR SEO ID NO: 8: 

25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 129 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

30 

(ii) MOLECULE TYPE: protein 
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(Xi) SEQUENCE DESCRIPTION : SEQ ID NO : 8 



Asp Gln Val Asp Val Lys Asp Cys Ala Asn Asn Glu Ile Lys Lys Val
1 5 10 15

Met Val Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys
35 40 45

Thr Ala Lys Thr Glu Ile Lys Ala Ser Leu Asp Gly Leu Glu Ile Asp
50 55 60

Val Pro Gly Ile Asp Thr Asn Ala Cys His Phe Met Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Ala Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Val Gly
100 105 110

Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg
115 120 125

Asp

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAATTCCTTT TTTTTTCTTT CTCTCTCTAA AATCTAAAAT CCATCCAAC ATG AAA ATT 58
Met Lys Ile
-98

GTT TTG GCC ATC GCC TCA TTG TTG GCA TTG AGC GCT GTT TAT GCT CGT 106
Thr Leu Ala Ile Ala Ser Leu Leu Ala Leu Ser Ala Val Tyr Ala Arg
-95 -90 -85 -80

CCA TCA TCG ATC AAA ACT TTT GAA GAA TAC AAA AAA GCC TTC AAC AAA 154
Pro Ser Ser Ile Lys Thr Phe Glu Glu Tyr Lys Lys Ala Phe Asn Lys
-75 -70 -65

AGT TAT GCT ACC TTC GAA GAT CAA GAA GCT GCC CGT AAA AAC TTT TTG 202
Ser Tyr Ala Thr Phe Glu Asp Glu Glu Ala Ala Arg Lys Asn Phe Leu
-60 -55 -50

GAA TCA GTA AAA TAT GTT CAA TCA AAT GGA GGT GCC ATC AAC CAT TTG 250
Glu Ser Val Lys Tyr Val Gln Ser Asn Gly Gly Ala Ile Asn His Leu
-45 -40 -35

TCC GAT TTG TCG TTG GAT GAA TTC AAA AAC CGA TTT TTG ATG AGT GCA 298
Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Phe Leu Met Ser Ala
-30 -25 -20

GAA GCT TTT GAA CAC CTC AAA ACT CAA TTC GAT TTG AAT GCT GAA ACT 346
Glu Ala Phe Glu His Leu Lys Thr Gln Phe Asp Leu Asn Ala Glu Thr
-15 -10 -5 -1 1

AAC GCC TGC AGT ATC AAT GGA AAT GCT CCA GCT GAA ATC GAT TTG CGA 394
Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile Asp Leu Arg
5 10 15

CAA ATG CGA ACT GTC ACT CCC ATT CGT ATG CAA GGA GGC TGT GGT TCA 442
Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly Cys Gly Ser
20 25 30

TGT TGG GCT TTC TCT GGT GTT GCC GCA ACT GAA TCA GCT TAT TTG GCT 490
Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu Ala
35 40 45

CAC CGT AAT CAA TCA TTG GAT CTT GCT GAA CAA GAA TTA GTC GAT TGT 538
His Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu Val Asp Cys
50 55 60 65

GCT TCC CAA CAC GGT TGT CAT GGT GAT ACC ATT CCA CGT GGT ATT GAA 586
Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg Gly Ile Glu
70 75 80

TAC ATC CAA CAT AAT GGT GTC GTC CAA GAA AGC TAC TAT CGA TAC GTT 634
Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr Arg Tyr Val
85 90 95

GCA CGA GAA CAA TCA TGC CGA CGA CCA AAT GCA CAA CGT TTC GGT ATC 682
Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg Phe Gly Ile
100 105 110

TCA AAC TAT TGC CAA ATT TAC CCA CCA AAT GCA AAC AAA ATT CGT GAA 730
Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Ala Asn Lys Ile Arg Glu
115 120 125

GCT TTG GCT CAA ACC CAC AGC GCT ATT GCC GTC ATT ATT GGC ATC AAA 778
Ala Leu Ala Gln Thr His Ser Ala Ile Ala Val Ile Ile Gly Ile Lys
130 135 140 145

GAT TTA GAC GCA TTC CGT CAT TAT GAT GGC CGA ACA ATC ATT CAA CGC 826
Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile Ile Gln Arg
150 155 160

GAT AAT GGT TAC CAA CCA AAC TAT CAC GCT GTC AAC ATT GTT GGT TAC 874
Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile Val Gly Tyr
165 170 175

AGT AAC GCA CAA GGT GTC GAT TAT TGG ATC GTA CGA AAC AGT TGG GAT 922
Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn Ser Trp Asp
180 185 190

ACC AAT TGG GGT GAT AAT GGT TAC GGT TAT TTT GCT GCC AAC ATC GAT 970
Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala Asn Ile Asp
195 200 205

TTG ATG ATG ATT GAA GAA TAT CCA TAT GTT GTC ATT CTC TAAACAAAAA 1019
Leu Met Met Ile Glu Glu Tyr Pro Tyr Val Val Ile Leu
210 215 220

GACAATTTCT TATATGATTG TCACTAATTT ATTTAAAATC AAAATTTTTA GAAAATGAAT 1079
AAATTCATTC ACAAAAATTA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1139
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 1172

(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 320 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Lys Ile Thr Leu Ala Ile Ala Ser Leu Leu Ala Leu Ser Ala Val
-98 -95 -90 -85

Tyr Ala Arg Pro Ser Ser Ile Lys Thr Phe Glu Glu Tyr Lys Lys Ala
-80 -75 -70

Phe Asn Lys Ser Tyr Ala Thr Phe Glu Asp Glu Glu Ala Ala Arg Lys
-65 -60 -55

Asn Phe Leu Glu Ser Val Lys Tyr Val Gln Ser Asn Gly Gly Ala Ile
-50 -45 -40 -35

Asn His Leu Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Phe Leu
-30 -25 -20

Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr Gln Phe Asp Leu Asn
-15 -10 -5

Ala Glu Thr Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile
-1 1 5 10

Asp Leu Arg Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly
15 20 25 30

Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala
35 40 45

Tyr Leu Ala His Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu
50 55 60

Val Asp Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg
65 70 75

Gly Ile Glu Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr
80 85 90

Arg Tyr Val Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg
95 100 105 110

Phe Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Ala Asn Lys
115 120 125

Ile Arg Glu Ala Leu Ala Gln Thr His Ser Ala Ile Ala Val Ile Ile
130 135 140

Gly Ile Lys Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile
145 150 155

Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile
160 165 170

Val Gly Tyr Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn
175 180 185 190

Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala
195 200 205

Asn Ile Asp Leu Met Met Ile Glu Glu Tyr Pro Tyr Val Val Ile Leu
210 215 220


(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 50
(D) OTHER INFORMATION: /label=Xaa is His or Tyr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 81
(D) OTHER INFORMATION: /label=Xaa is Glu or Lys

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 124
(D) OTHER INFORMATION: /label=Xaa is Ala or Val

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 136
(D) OTHER INFORMATION: /label=Xaa is Ser or Thr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 215
(D) OTHER INFORMATION: /label=Xaa is Glu or Gln

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Thr Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile Asp Leu
1 5 10 15

Arg Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly Cys Gly
20 25 30

Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu
35 40 45

Ala Xaa Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu Val Asp
50 55 60

Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg Gly Ile
65 70 75 80

Xaa Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr Arg Tyr
85 90 95

Val Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg Phe Gly
100 105 110

Ile Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Xaa Asn Lys Ile Arg
115 120 125

Glu Ala Leu Ala Gln Thr His Xaa Ala Ile Ala Val Ile Ile Gly Ile
130 135 140

Lys Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile Ile Gln
145 150 155 160

Arg Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile Val Gly
165 170 175

Tyr Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn Ser Trp
180 185 190

Asp Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala Asn Ile
195 200 205

Asp Leu Met Met Ile Glu Xaa Tyr Pro Tyr Val Val Ile Leu
210 215 220



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 129 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 47
(D) OTHER INFORMATION: /label=Xaa is Thr or Ser

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 114
(D) OTHER INFORMATION: /label=Xaa is Asp or Asn

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 127
(D) OTHER INFORMATION: /label=Xaa is Ile or Leu

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys Val
1 5 10 15

Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln Asn Xaa Lys
35 40 45

Thr Ala Lys Ile Glu Ile Lys Ala Ser Ile Asp Gly Leu Glu Val Asp
50 55 60

Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met Gly
100 105 110

Asp Xaa Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Xaa Arg
115 120 125

Asp



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 129 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 11
(D) OTHER INFORMATION: /label=Xaa is Asn or Ser

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 52
(D) OTHER INFORMATION: /label=Xaa is Thr or Ile

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 54
(D) OTHER INFORMATION: /label=Xaa is Ile or Thr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 76
(D) OTHER INFORMATION: /label=Xaa is Met or Val

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 88
(D) OTHER INFORMATION: /label=Xaa is Ala or Ile

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 111
(D) OTHER INFORMATION: /label=Xaa is Val or Ile

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn Xaa Glu Ile Lys Lys Val
1 5 10 15

Met Val Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys
35 40 45

Thr Ala Lys Xaa Glu Xaa Lys Ala Ser Leu Asp Gly Leu Glu Ile Asp
50 55 60

Val Pro Gly Ile Asp Thr Asn Ala Cys His Phe Xaa Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Xaa Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Xaa Gly
100 105 110

Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg
115 120 125

Asp

CLAIMS

1. A protein allergen of Der p II comprising the amino acid sequence:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys Val Leu Val Pro
Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg Gly Lys Pro Phe Gln Leu Glu
Ala Val Phe Glu Ala Asn Gln Asn Xaa1 Lys Thr Ala Lys Ile Glu Ile Lys Ala Ser
Ile Asp Gly Leu Glu Val Asp Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met
Lys Cys Pro Leu Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro
Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met Gly Xaa2 Asp
Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Xaa3 Arg Asp

where Xaa1 is selected from the group consisting of Thr and Ser;
where Xaa2 is selected from the group consisting of Asp and Asn; and
where Xaa3 is selected from the group consisting of Ile and Leu,
except for the amino acid sequence where Xaa1 is Thr, Xaa2 is Asp
and Xaa3 is Ile.
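
As an illustration only, the combinations of variable residues covered by claim 1 can be enumerated with a short Python sketch. The names `CLAIM1_CHOICES`, `CLAIM1_EXCLUDED` and `claim1_variants` are hypothetical and introduced here for illustration; the residue choices and the single excluded combination are taken from the claim wording, with the garbled residue "He" read as Ile.

```python
from itertools import product

# Residues permitted at each variable position of claim 1 (hypothetical names).
CLAIM1_CHOICES = {
    "Xaa1": ("Thr", "Ser"),
    "Xaa2": ("Asp", "Asn"),
    "Xaa3": ("Ile", "Leu"),
}

# The one combination that the claim expressly excludes.
CLAIM1_EXCLUDED = ("Thr", "Asp", "Ile")


def claim1_variants():
    """Return every (Xaa1, Xaa2, Xaa3) combination that the claim covers."""
    return [
        combo
        for combo in product(*CLAIM1_CHOICES.values())
        if combo != CLAIM1_EXCLUDED
    ]


if __name__ == "__main__":
    for combo in claim1_variants():
        print(dict(zip(CLAIM1_CHOICES, combo)))
```

Under this reading, two choices at each of three positions give eight combinations, of which seven remain once the excluded one is removed.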

2. A protein allergen of Der f II comprising the amino acid sequence:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn Xaa1 Glu Ile Lys Lys Val Met Val
Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly Lys Pro Phe Thr Leu
Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys Thr Ala Lys Xaa2 Glu Xaa3 Lys
Ala Ser Leu Asp Gly Leu Glu Ile Asp Val Pro Gly Ile Asp Thr Asn Ala Cys His
Phe Xaa4 Lys Cys Pro Leu Val Lys Gly Gln Gln Tyr Asp Xaa5 Lys Tyr Thr Trp
Asn Val Pro Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Xaa6
Gly Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg Asp

where Xaa1 is selected from the group consisting of Asn and Ser;
where Xaa2 is selected from the group consisting of Thr and Ile;
where Xaa3 is selected from the group consisting of Ile and Thr;
where Xaa4 is selected from the group consisting of Met and Val;
where Xaa5 is selected from the group consisting of Ala and Ile; and
where Xaa6 is selected from the group consisting of Val and Ile, with
the proviso that,
when Xaa1 is Asn, then Xaa3 is Thr; and
when Xaa3 is Ile, then Xaa1 is Ser.
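
The proviso of claim 2 can be checked mechanically. The Python sketch below is illustrative only: the names `CLAIM2_CHOICES`, `satisfies_proviso` and `claim2_variants` are hypothetical, and the proviso is read as "when Xaa1 is Asn, then Xaa3 is Thr; and when Xaa3 is Ile, then Xaa1 is Ser", with the garbled residue "He" read as Ile.

```python
from itertools import product

# Residues permitted at each variable position of claim 2 (hypothetical names).
CLAIM2_CHOICES = {
    "Xaa1": ("Asn", "Ser"),
    "Xaa2": ("Thr", "Ile"),
    "Xaa3": ("Ile", "Thr"),
    "Xaa4": ("Met", "Val"),
    "Xaa5": ("Ala", "Ile"),
    "Xaa6": ("Val", "Ile"),
}


def satisfies_proviso(variant):
    """Apply both conditions of the proviso to one assignment of residues."""
    if variant["Xaa1"] == "Asn" and variant["Xaa3"] != "Thr":
        return False
    if variant["Xaa3"] == "Ile" and variant["Xaa1"] != "Ser":
        return False
    return True


def claim2_variants():
    """Return every residue assignment that satisfies the proviso."""
    names = list(CLAIM2_CHOICES)
    variants = []
    for combo in product(*CLAIM2_CHOICES.values()):
        variant = dict(zip(names, combo))
        if satisfies_proviso(variant):
            variants.append(variant)
    return variants


if __name__ == "__main__":
    print(len(claim2_variants()))
```

On this reading the two conditions are contrapositives of one another, so together they rule out only the pairing of Asn at Xaa1 with Ile at Xaa3, leaving 48 of the 64 possible assignments.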
3. A therapeutic composition comprising a protein allergen of claim 1
and a pharmaceutically acceptable carrier or diluent.

4. A method of treatment for sensitivity in an individual to house dust
mites, comprising administering to the individual an effective therapeutic amount
of a composition of claim 3.

5. A therapeutic composition comprising a protein allergen of claim 2
and a pharmaceutically acceptable carrier or diluent.

6. A method of treatment for sensitivity in an individual to house dust
mites, comprising administering to the individual an effective therapeutic amount
of a composition of claim 5.